FastIF: Scalable Influence Functions for Efficient Model Interpretation and Debugging

Influence functions approximate the "influences" of training data points on test predictions and have a wide variety of applications. Despite their popularity, their computational cost does not scale well with model and training data size. We present FastIF, a set of simple modifications to influence functions that significantly improves their run-time. We use k-Nearest Neighbors (kNN) to narrow the search space down to a subset of good candidate data points, identify the configurations that best balance the speed-quality trade-off in estimating the inverse Hessian-vector product, and introduce a fast parallel variant. Our proposed method achieves about an 80x speedup while remaining highly correlated with the original influence values. With the availability of the fast influence functions, we demonstrate their usefulness in four applications. First, we examine whether influential data points can "explain" test-time behavior using the framework of simulatability. Second, we visualize the influence interactions between training and test data points. Third, we show that we can correct model errors by additional fine-tuning on certain influential data points, improving the accuracy of a trained MultiNLI model by 2.5% on the HANS dataset. Finally, we experiment with a similar setup but fine-tune on data points not seen during training, improving the model accuracy by 2.8% and 1.7% on the HANS and ANLI datasets, respectively. Overall, our fast influence functions can be efficiently applied to large models and datasets, and our experiments demonstrate the potential of influence functions in model interpretation and correcting model errors. Code is available at https://github.com/salesforce/fast-influence-functions
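
The pipeline described in the abstract (kNN pre-selection of candidates, then an estimated inverse Hessian-vector product, then one dot product per candidate) can be illustrated on a toy problem. The following is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the model is a small logistic regression, the kNN search runs on raw input features, the inverse Hessian-vector product uses a simple fixed-point iteration on an explicit damped Hessian, and every function name, the damping value, and the step count are hypothetical choices made for illustration. The parallel variant is not shown.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit(X, y, lr=0.5, steps=300):
    """Plain gradient descent on the mean logistic loss (toy trainer)."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w -= lr * X.T @ (sigmoid(X @ w) - y) / len(X)
    return w

def grad_loss(w, x, y):
    """Gradient of the logistic loss for one example (x, y), y in {0, 1}."""
    return (sigmoid(x @ w) - y) * x

def hessian(w, X, damping=0.01):
    """Mean Hessian of the logistic loss plus a damping term (assumed value)."""
    p = sigmoid(X @ w)
    H = (X * (p * (1 - p))[:, None]).T @ X / len(X)
    return H + damping * np.eye(X.shape[1])

def inverse_hvp(H, v, steps=1000):
    """Estimate H^{-1} v with the recursion h <- v + (I - H) h.

    Converges when the eigenvalues of H lie in (0, 2), which holds for
    this toy data; larger models would need rescaling and stochastic
    Hessian-vector products instead of an explicit matrix.
    """
    est = v.copy()
    for _ in range(steps):
        est = v + est - H @ est
    return est

def knn_candidates(X_train, x_test, k):
    """Indices of the k training points nearest to x_test (Euclidean distance,
    raw input features: an assumption made for this sketch)."""
    dists = np.linalg.norm(X_train - x_test, axis=1)
    return np.argsort(dists)[:k]

def influences(w, X_train, y_train, x_test, y_test, k=20):
    """Influence scores restricted to the kNN candidate subset."""
    H = hessian(w, X_train)
    # The inverse Hessian-vector product is computed once per test point.
    s_test = inverse_hvp(H, grad_loss(w, x_test, y_test))
    idx = knn_candidates(X_train, x_test, k)
    scores = np.array(
        [-s_test @ grad_loss(w, X_train[i], y_train[i]) for i in idx]
    )
    return idx, scores

# Synthetic, noisy, linearly generated data (purely illustrative).
rng = np.random.default_rng(0)
n, d = 200, 5
w_true = rng.normal(size=d)
X_train = rng.normal(size=(n, d))
y_train = (X_train @ w_true + rng.normal(size=n) > 0).astype(float)
x_test = rng.normal(size=d)
y_test = float(x_test @ w_true > 0)

w = fit(X_train, y_train)
idx, scores = influences(w, X_train, y_train, x_test, y_test, k=20)
for i, s in sorted(zip(idx, scores), key=lambda t: t[1])[:3]:
    print(f"train index {i}: influence {s:.4f}")
```

Under the sign convention used in this sketch, a negative score means that upweighting the training point would lower the loss on the test point. The cost saving comes from scoring only the `k` candidates rather than all `n` training points, while the inverse Hessian-vector product is shared across them.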

