The behaviors of deep neural networks (DNNs) are notoriously resistant to human interpretation. In this paper, we propose Hypergradient Data Relevance Analysis, or HYDRA, which interprets the predictions made by DNNs as effects of their training data. Existing approaches generally estimate data contributions around the final model parameters and ignore how the training data shape the optimization trajectory. By unrolling the hypergradient of the test loss w.r.t. the weights of training data, HYDRA assesses the contribution of training data to test data points throughout the training trajectory. To accelerate computation, we remove the Hessian from the calculation and prove that, under moderate conditions, the approximation error is bounded. Corroborating this theoretical claim, empirical results indicate that the error is indeed small. In addition, we quantitatively demonstrate that HYDRA outperforms influence functions in accurately estimating data contribution and detecting noisy data labels. The source code is available at https://github.com/cyyever/aaai_hydra_8686.
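The abstract only sketches the Hessian-free hypergradient unrolling at a high level. The toy PyTorch sketch below illustrates the idea under simplifying assumptions: a linear model, random data, and full-batch SGD, where each training point's accumulated term drops the Hessian factor and simply collects its own scaled gradients along the trajectory, then is contracted with the test-loss gradient at the final parameters. All names (`accum`, `hydra_scores`, `flat_grad`, the toy data) are illustrative and are not taken from the paper or its released code.

```python
# Hypothetical sketch of the Hessian-free hypergradient approximation:
# accumulate each training point's contribution along the SGD trajectory,
# then contract with the test-loss gradient at the final parameters.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Toy data: 20 training points, 1 test point, 5 features, 2 classes.
X_train = torch.randn(20, 5)
y_train = torch.randint(0, 2, (20,))
x_test = torch.randn(1, 5)
y_test = torch.tensor([1])

model = nn.Linear(5, 2)
params = list(model.parameters())
n_params = sum(p.numel() for p in params)
lr = 0.1

def flat_grad(loss, params):
    """Flattened gradient of `loss` w.r.t. `params`."""
    grads = torch.autograd.grad(loss, params)
    return torch.cat([g.reshape(-1) for g in grads])

# accum[i] approximates d(theta_T)/d(w_i) with the Hessian term dropped:
# each SGD step just adds the (scaled) negative gradient of example i.
accum = torch.zeros(len(X_train), n_params)

for epoch in range(30):
    # Per-example gradients at the current parameters.
    per_example_grads = []
    for i in range(len(X_train)):
        loss_i = F.cross_entropy(model(X_train[i:i + 1]), y_train[i:i + 1])
        per_example_grads.append(flat_grad(loss_i, params))

    # Hessian-free unrolled hypergradient update for every training point.
    for i, g_i in enumerate(per_example_grads):
        accum[i] -= lr * g_i / len(X_train)

    # Ordinary full-batch SGD parameter update.
    batch_grad = torch.stack(per_example_grads).mean(dim=0)
    with torch.no_grad():
        offset = 0
        for p in params:
            n = p.numel()
            p -= lr * batch_grad[offset:offset + n].view_as(p)
            offset += n

# Contribution score of each training point toward the test point:
# dot product of the test-loss gradient at theta_T with the accumulated terms.
# A negative score means upweighting that point would lower the test loss.
test_loss = F.cross_entropy(model(x_test), y_test)
g_test = flat_grad(test_loss, params)
hydra_scores = accum @ g_test      # one score per training point
print(hydra_scores)
```

In this sketch the per-point scores can be ranked to surface the training examples most helpful or harmful to the test prediction, which is the same use case the abstract describes for estimating data contribution and flagging noisy labels.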