Understanding Black-box Predictions via Influence Functions

Why did the system make this prediction? How can we explain where the model came from?

In this paper, the authors tackle this question by tracing a model's prediction through its learning algorithm and back to the training data, from which the model parameters are ultimately derived.

They use influence functions, a classic technique from robust statistics (Cook & Weisberg, 1980) that tells us how the model parameters change as we upweight a training point by an infinitesimal amount.
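
To make the upweighting idea concrete, here is a minimal sketch on a toy problem. It assumes L2-regularized linear regression with the regularizer folded into each per-example loss, where the minimizer has a closed form, and uses the standard influence expression -H^{-1} ∇L(z, θ̂), with H the Hessian of the average training loss at the fitted parameters. The function names (`fit`, `influence_on_params`), the synthetic data, and the value of `lam` are illustrative assumptions, not taken from the paper's code. The sketch compares the influence-function estimate with the change actually observed when one training point is upweighted by a small `eps` and the model is refit.

```python
import numpy as np

def fit(X, y, lam, eps=0.0, idx=None):
    """Minimise (1/n) * sum_i L(z_i, theta) + eps * L(z_idx, theta),
    where L(z, theta) = 0.5 * (x @ theta - y)**2 + 0.5 * lam * ||theta||^2."""
    n, d = X.shape
    A = X.T @ X / n + lam * np.eye(d)
    b = X.T @ y / n
    if idx is not None:
        # Extra weight eps on the chosen training point (hypothetical helper logic).
        x, t = X[idx], y[idx]
        A = A + eps * (np.outer(x, x) + lam * np.eye(d))
        b = b + eps * t * x
    return np.linalg.solve(A, b)

def influence_on_params(X, y, lam, theta, idx):
    """Influence of upweighting point idx on the parameters: -H^{-1} grad L(z, theta)."""
    n, d = X.shape
    H = X.T @ X / n + lam * np.eye(d)        # Hessian of the average training loss
    x, t = X[idx], y[idx]
    grad = x * (x @ theta - t) + lam * theta  # gradient of the loss at point idx
    return -np.linalg.solve(H, grad)

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)
lam = 0.1

theta_hat = fit(X, y, lam)
predicted = influence_on_params(X, y, lam, theta_hat, idx=7)

# Finite-difference check: refit with a tiny upweight and measure the change.
eps = 1e-5
actual = (fit(X, y, lam, eps=eps, idx=7) - theta_hat) / eps

print("influence estimate:", predicted)
print("refit difference:  ", actual)
```

On this toy problem the two vectors should agree closely, because the influence function is exactly the derivative of the fitted parameters with respect to `eps` at `eps = 0`. For models without a closed-form minimizer, refitting is expensive, which is why an estimate that avoids retraining is attractive.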