Extractive question answering (QA) models tend to exploit spurious correlations to make predictions when a training set has unintended biases. This tendency results in models that do not generalize to examples where the correlations do not hold. Determining which spurious correlations QA models can exploit is crucial for building generalizable QA models for real-world applications; moreover, a method needs to be developed that prevents these models from learning the spurious correlations even when a training set is biased. In this study, we discover that the relative position of an answer, defined as the relative distance from the answer span to the closest question-context overlap word, can be exploited by QA models as a superficial cue for making predictions. Specifically, we find that when the relative positions in a training set are biased, performance on examples with relative positions unseen during training is significantly degraded. To mitigate this performance degradation for unseen relative positions, we propose an ensemble-based debiasing method that does not require prior knowledge about the distribution of relative positions. We demonstrate that the proposed method mitigates the models' reliance on relative positions on both the biased and the full SQuAD datasets. We hope that this study helps enhance the generalization ability of QA models in real-world applications.
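The relative position described above can be made concrete with a small sketch. This is an illustrative implementation, not the paper's code: the tokenization, the choice of measuring distance to the nearest edge of the answer span, and the sign convention (positive when the overlap word precedes the answer) are all assumptions made here for clarity.

```python
def relative_position(question_tokens, context_tokens, answer_start, answer_end):
    """Signed token distance from the answer span [answer_start, answer_end]
    (inclusive indices into context_tokens) to the closest word that appears
    in both the question and the context. Returns 0 if an overlap word lies
    inside the span, and None if question and context share no words.

    Sign convention (an assumption of this sketch): positive when the
    closest overlap word precedes the answer, negative when it follows.
    """
    overlap = set(question_tokens) & set(context_tokens)
    positions = [i for i, tok in enumerate(context_tokens) if tok in overlap]
    if not positions:
        return None  # relative position undefined without any overlap word

    best = None
    for i in positions:
        if answer_start <= i <= answer_end:
            return 0  # overlap word falls inside the answer span
        # Distance to the nearest edge of the answer span.
        d = answer_start - i if i < answer_start else i - answer_end
        if best is None or abs(d) < abs(best):
            best = d
    return best
```

For example, with the question `["who", "wrote", "hamlet"]` and the context `["hamlet", "was", "written", "by", "shakespeare"]`, the answer span covering `"shakespeare"` (index 4) has relative position 4 under this convention, since the only overlap word, `"hamlet"`, sits four tokens before it. A biased training set in the paper's sense would be one in which such values are concentrated in a narrow range.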