A popular approach to explaining NLP models is to use importance measures, such as attention, which indicate which input tokens are important for making a prediction. However, it remains an open question how accurately these explanations reflect a model's logic, a property called faithfulness. To answer this question, we propose Recursive ROAR, a new faithfulness metric. It works by recursively masking allegedly important tokens and then retraining the model; the principle is that this should degrade model performance more than masking random tokens. The result is a performance curve as a function of the masking ratio. Furthermore, we propose a summarizing metric, the relative area-between-curves (RACU), which allows for easy comparison across papers, models, and tasks. We evaluate 4 different importance measures on 8 different datasets, using both LSTM-attention models and RoBERTa models. We find that the faithfulness of importance measures is both model-dependent and task-dependent. This conclusion contradicts previous evaluations in both the computer vision and the faithfulness-of-attention literature.
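To make the summary metric concrete, below is a minimal sketch of a RACU-style computation: the area between the random-masking and importance-masking performance curves, normalized by the maximal achievable area (the random curve down to the fully-masked baseline). The function name `racu`, the trapezoidal integration, and this particular normalization are illustrative assumptions, not the paper's exact definition.

```python
# Sketch of a RACU-style summary statistic (illustrative; the paper's
# exact normalization may differ).
import numpy as np

def racu(ratios, perf_importance, perf_random, perf_full_mask):
    """Relative area between the two masking performance curves.

    ratios          -- masking ratios, e.g. [0.0, 0.2, ..., 1.0]
    perf_importance -- performance when masking allegedly important tokens
    perf_random     -- performance when masking the same number of random tokens
    perf_full_mask  -- performance with all tokens masked (lower bound)
    """
    perf_importance = np.asarray(perf_importance)
    perf_random = np.asarray(perf_random)
    # Area the importance measure gains over random masking: a faithful
    # measure degrades performance faster, so this area is positive.
    acu = np.trapz(perf_random - perf_importance, ratios)
    # Maximal possible area: random curve down to the fully-masked baseline
    # (assumed normalizer, for illustration).
    max_acu = np.trapz(perf_random - perf_full_mask, ratios)
    return acu / max_acu

# Example with made-up curves: both start at the unmasked accuracy and
# end at the fully-masked accuracy; the importance curve drops faster.
ratios = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
print(racu(ratios,
           [0.90, 0.70, 0.55, 0.45, 0.40, 0.38],   # masking "important" tokens
           [0.90, 0.85, 0.78, 0.68, 0.55, 0.38],   # masking random tokens
           0.38))                                   # everything masked
```

Under this normalization, a RACU near 1 means the importance measure is about as informative as possible relative to random masking, while a RACU near 0 means it is no better than random.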