Many language tasks (e.g., Named Entity Recognition, Part-of-Speech tagging, and Semantic Role Labeling) are naturally framed as sequence tagging problems. However, there has been comparatively little work on interpretability methods for sequence tagging models. In this paper, we extend influence functions - which aim to trace predictions back to the training points that informed them - to sequence tagging tasks. We define the influence of a training instance segment as the effect that perturbing the labels within this segment has on a test segment-level prediction. We provide an efficient approximation to compute this, and show empirically that it tracks the true segment influence. We demonstrate the practical utility of segment influence by using the method to identify systematic annotation errors in two named entity recognition corpora. Code to reproduce our results is available at https://github.com/successar/Segment_Influence_Functions.
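As a rough orientation, one can read the segment influence described above as the standard first-order influence-function approximation (Koh and Liang, 2017) with the per-example loss replaced by a loss summed over the tokens of a segment; the notation below is an illustrative sketch under that assumption, not the paper's exact formulation:

\[
\mathcal{I}(s_{\text{train}}, s_{\text{test}}) \;\approx\; -\,\nabla_\theta L(s_{\text{test}}, \hat{\theta})^{\top}\, H_{\hat{\theta}}^{-1}\, \nabla_\theta L(s_{\text{train}}, \hat{\theta}),
\qquad
L(s, \theta) = \sum_{t \in s} \ell(y_t, x, \theta),
\]

where \(s_{\text{train}}\) and \(s_{\text{test}}\) denote the training and test segments, \(\hat{\theta}\) the trained parameters, \(H_{\hat{\theta}}\) the Hessian of the training loss, and \(\ell(y_t, x, \theta)\) the token-level tagging loss; perturbing the labels of \(s_{\text{train}}\) changes \(\nabla_\theta L(s_{\text{train}}, \hat{\theta})\), which is what the segment influence measures.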