Pre-trained language models have achieved human-level performance on many Machine Reading Comprehension (MRC) tasks, but it remains unclear whether these models truly understand language or answer questions by exploiting statistical biases in datasets. Here, we demonstrate a simple yet effective method to attack MRC models and reveal their statistical biases. We apply the method to the RACE dataset, in which the answer to each MRC question is selected from 4 options. We find that several pre-trained language models, including BERT, ALBERT, and RoBERTa, show a consistent preference for certain options, even when these options are irrelevant to the question. When interfered with by these irrelevant options, the performance of MRC models can drop from human-level to chance-level performance. Human readers, however, are not clearly affected by these irrelevant options. Finally, we propose an augmented training method that greatly reduces the models' statistical biases.
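To make the probing idea concrete, below is a minimal sketch (not the authors' released code) of how one might measure an MRC model's preference for individual options on a RACE-style question: score the question once with its original options and once with the distractors replaced by options copied from an unrelated question. The checkpoint name and the helper `option_scores` are assumptions introduced purely for illustration.

```python
# Minimal sketch of probing an MRC model's option preference (illustrative only).
# Assumption: a RACE-finetuned multiple-choice checkpoint is available on the
# HuggingFace Hub; the name below is an assumed example, not the authors' model.
import torch
from transformers import AutoTokenizer, AutoModelForMultipleChoice

MODEL_NAME = "LIAMF-USP/roberta-large-finetuned-race"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForMultipleChoice.from_pretrained(MODEL_NAME).eval()

def option_scores(passage: str, question: str, options: list[str]) -> list[float]:
    """Return the model's logit for each candidate option (hypothetical helper)."""
    first = [passage] * len(options)
    second = [f"{question} {opt}" for opt in options]
    enc = tokenizer(first, second, padding=True, truncation=True, return_tensors="pt")
    # The multiple-choice head expects tensors of shape (batch, num_choices, seq_len).
    enc = {k: v.unsqueeze(0) for k, v in enc.items()}
    with torch.no_grad():
        logits = model(**enc).logits  # shape: (1, num_choices)
    return logits.squeeze(0).tolist()

# Usage idea: compare the score of the correct option with the scores obtained
# after swapping the three distractors for options taken from another question.
# A model relying on statistical biases keeps assigning high scores to slots
# filled with irrelevant options.
```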