We consider the problem of identifying a minimal subset of training data $\mathcal{S}_t$ such that, if the instances comprising $\mathcal{S}_t$ had been removed prior to training, the categorization of a given test point $x_t$ would have been different. Identifying such a set may be of interest for a few reasons. First, the cardinality of $\mathcal{S}_t$ provides a measure of robustness (if $|\mathcal{S}_t|$ is small for $x_t$, we might be less confident in the corresponding prediction), which we show is correlated with but complementary to predicted probabilities. Second, interrogation of $\mathcal{S}_t$ may provide a novel mechanism for contesting a particular model prediction: if one can make the case that the points in $\mathcal{S}_t$ are wrongly labeled or irrelevant, this may argue for overturning the associated prediction. Identifying $\mathcal{S}_t$ via brute force is intractable. We propose comparatively fast approximation methods to find $\mathcal{S}_t$ based on influence functions, and find that -- for simple convex text classification models -- these approaches can often successfully identify relatively small sets of training examples which, if removed, would flip the prediction.
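As a concrete illustration of the convex case, the sketch below shows one way an influence-function approximation can be used to search for such a set $\mathcal{S}_t$ with a binary L2-regularized logistic regression classifier. It estimates, for each training point, the change in the test margin caused by removing that point (a one-step Newton approximation), greedily accumulates the points that push the margin toward the decision boundary, and then retrains to verify the flip. This is a minimal sketch under stated assumptions, not the paper's exact algorithm; the function names (`removal_effects`, `find_flip_set`) and the synthetic dataset are illustrative.

```python
# Hedged sketch: influence-function-based search for a prediction-flipping
# removal set, assuming binary L2-regularized logistic regression (a convex
# model). Names and the greedy procedure are illustrative, not the paper's
# exact method; real text-classification features would typically be sparse.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

def fit(X, y, C=1.0):
    clf = LogisticRegression(C=C, fit_intercept=False, max_iter=1000)
    clf.fit(X, y)
    return clf

def removal_effects(X, y, w, x_t, C=1.0):
    """Approximate the change in the test margin x_t^T w from removing each
    training point via a one-step Newton (influence-function) estimate:
        w_{-i} - w  ~=  H^{-1} g_i,   g_i = C * (p_i - y_i) x_i,
        H = I + C * X^T diag(p(1-p)) X   (Hessian of sklearn's objective)."""
    p = 1.0 / (1.0 + np.exp(-X @ w))                    # predicted P(y=1)
    H = np.eye(X.shape[1]) + C * (X.T * (p * (1 - p))) @ X
    v = np.linalg.solve(H, x_t)                         # H^{-1} x_t, computed once
    grads = C * (p - y)[:, None] * X                    # per-example gradients g_i
    return grads @ v                                    # delta_margin_i = g_i^T H^{-1} x_t

def find_flip_set(X, y, x_t, C=1.0):
    """Greedily pick training points whose approximated removal pushes the
    test margin across the decision boundary, then retrain to verify."""
    clf = fit(X, y, C)
    w = clf.coef_.ravel()
    margin = float(x_t @ w)
    deltas = removal_effects(X, y, w, x_t, C)
    # Only removals that move the margin toward zero can help flip the prediction.
    helpful = np.where(np.sign(deltas) == -np.sign(margin))[0]
    order = helpful[np.argsort(np.abs(deltas[helpful]))[::-1]]  # largest effect first
    flip_set, cum = [], margin
    for i in order:
        flip_set.append(int(i))
        cum += deltas[i]
        if np.sign(cum) != np.sign(margin):              # approximation predicts a flip
            break
    # Verify by actually retraining without the candidate set.
    keep = np.setdiff1d(np.arange(len(y)), flip_set)
    clf_removed = fit(X[keep], y[keep], C)
    flipped = clf_removed.predict(x_t[None, :])[0] != clf.predict(x_t[None, :])[0]
    return flip_set, flipped

if __name__ == "__main__":
    X, y = make_classification(n_samples=500, n_features=20, random_state=0)
    x_t = X[0]                                           # treat one point as the test example
    S_t, flipped = find_flip_set(X[1:], y[1:], x_t)
    print(f"|S_t| = {len(S_t)}, prediction flipped after retraining: {flipped}")
```

Because the per-point effects are all computed from a single linear solve $H^{-1} x_t$, ranking the full training set costs roughly one Hessian factorization rather than one retraining per candidate point, which is what makes this approximation fast relative to brute force.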