In attempts to develop sample-efficient and interpretable algorithms, researchers have explored myriad mechanisms for collecting and exploiting feature feedback (or rationales): auxiliary annotations provided for training (but not test) instances that highlight salient evidence. Examples include bounding boxes around objects and salient spans in text. Despite its intuitive appeal, feature feedback has not delivered significant gains in practical problems as assessed on iid holdout sets. However, recent works on counterfactually augmented data suggest an alternative benefit of supplemental annotations, beyond interpretability: lessening sensitivity to spurious patterns and consequently delivering gains in out-of-domain evaluations. We speculate that while existing methods for incorporating feature feedback have delivered negligible in-sample performance gains, they may nevertheless provide out-of-domain benefits. Our experiments on sentiment analysis show that feature feedback methods perform significantly better on various natural out-of-domain datasets despite comparable in-domain evaluations. By contrast, performance on natural language inference remains comparable. Finally, we compare the tasks where feature feedback does (and does not) help.