Explaining the predictions of neural black-box models is an important problem, especially when such models are used in applications where user trust is crucial. Estimating the influence of training examples on a learned neural model's behavior allows us to identify the training examples most responsible for a given prediction and, therefore, to faithfully explain the output of a black-box model. The most generally applicable existing method is based on influence functions, which scale poorly with sample size and model size. We propose gradient rollback, a general approach to influence estimation that is applicable to neural models in which each parameter update step during gradient descent touches only a small number of parameters, even if the overall number of parameters is large. Neural matrix factorization models trained with gradient descent belong to this model class. These models are popular and have a wide range of applications in industry; knowledge graph embedding methods, which also belong to this class, are used especially extensively. We show that gradient rollback is highly efficient at both training and test time. Moreover, we show theoretically that the difference between gradient rollback's influence approximation and the true influence on a model's behavior is smaller than known bounds on the stability of stochastic gradient descent. This establishes that gradient rollback robustly estimates example influence. We also conduct experiments that show gradient rollback provides faithful explanations on knowledge base completion and recommender datasets.
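To make the mechanism concrete, the following is a minimal sketch (not the authors' reference implementation) of gradient rollback for a DistMult-style matrix factorization model trained with plain SGD. The key property the abstract relies on is that one update step on a triple touches only three embedding rows, so the update each training example causes can be recorded cheaply and later "rolled back" to estimate that example's influence on a test prediction. All names here (`sgd_step`, `rollback_influence`, the toy triples, the logistic loss) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_entities, n_relations, dim = 50, 5, 16
E = rng.normal(scale=0.1, size=(n_entities, dim))   # entity embeddings
R = rng.normal(scale=0.1, size=(n_relations, dim))  # relation embeddings

def score(h, r, t):
    """DistMult score of triple (h, r, t)."""
    return float(np.sum(E[h] * R[r] * E[t]))

lr = 0.1
influence = {}  # training triple -> accumulated parameter deltas per row

def sgd_step(triple):
    """One SGD step on a positive triple with loss -log sigmoid(score).
    Only three embedding rows are touched; record their deltas."""
    h, r, t = triple
    g = -1.0 / (1.0 + np.exp(score(h, r, t)))  # d loss / d score
    # Compute all deltas before applying them (a simultaneous step).
    updates = [("E", h, -lr * g * R[r] * E[t]),
               ("R", r, -lr * g * E[h] * E[t]),
               ("E", t, -lr * g * E[h] * R[r])]
    rec = influence.setdefault(triple, {})
    for mat, i, delta in updates:
        (E if mat == "E" else R)[i] += delta
        rec[(mat, i)] = rec.get((mat, i), 0.0) + delta

def rollback_influence(train_triple, test_triple):
    """Estimate train_triple's influence on test_triple's score by
    temporarily undoing the parameter updates train_triple caused."""
    before = score(*test_triple)
    rec = influence.get(train_triple, {})
    for (mat, i), delta in rec.items():   # roll back
        (E if mat == "E" else R)[i] -= delta
    after = score(*test_triple)
    for (mat, i), delta in rec.items():   # restore the model
        (E if mat == "E" else R)[i] += delta
    return before - after                 # >0: example raised the score

train = [(0, 1, 2), (0, 1, 3), (4, 2, 2)]
for _ in range(100):
    for triple in train:
        sgd_step(triple)

test = (0, 1, 2)
for tr in train:
    print(tr, rollback_influence(tr, test))
```

The rollback avoids retraining the model once per training example: because each example's footprint on the parameters is small and recorded during training, its counterfactual removal can be approximated in time proportional to that footprint.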