With the rise of false, inaccurate, and misleading information in propaganda, news, and social media, real-world Question Answering (QA) systems face the challenge of synthesizing and reasoning over contradicting information to derive correct answers. This urgency gives rise to the need to make QA systems robust to misinformation, a topic previously unexplored. We study the risk that misinformation poses to QA models by investigating their behavior under contradicting contexts that mix real and fake information. We create the first large-scale dataset for this problem, namely Contra-QA, which contains over 10K human-written and model-generated pairs of contradicting contexts. Experiments show that QA models are vulnerable to contradicting contexts brought by misinformation. To defend against such a threat, we build a misinformation-aware QA system as a countermeasure that integrates question answering and misinformation detection in a joint fashion.
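The abstract does not specify how question answering and misinformation detection are combined; the following is a minimal sketch, assuming a shared Transformer encoder with an extractive QA head and a binary misinformation-detection head trained on a weighted sum of the two losses. All names (`JointMisinfoQA`, `misinfo_weight`, the choice of `bert-base-uncased`) are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a jointly trained QA + misinformation-detection model.
# The shared-encoder / two-head design and all names are assumptions,
# not the architecture actually used for Contra-QA.
import torch
import torch.nn as nn
from transformers import AutoModel


class JointMisinfoQA(nn.Module):
    def __init__(self, encoder_name="bert-base-uncased", misinfo_weight=0.5):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        # Extractive QA head: start/end logits over context tokens.
        self.qa_head = nn.Linear(hidden, 2)
        # Misinformation head: binary label for the whole context.
        self.misinfo_head = nn.Linear(hidden, 2)
        self.misinfo_weight = misinfo_weight

    def forward(self, input_ids, attention_mask,
                start_positions=None, end_positions=None, misinfo_labels=None):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        seq = out.last_hidden_state                    # (batch, seq_len, hidden)
        start_logits, end_logits = self.qa_head(seq).split(1, dim=-1)
        start_logits = start_logits.squeeze(-1)        # (batch, seq_len)
        end_logits = end_logits.squeeze(-1)
        misinfo_logits = self.misinfo_head(seq[:, 0])  # [CLS] representation

        loss = None
        if start_positions is not None and misinfo_labels is not None:
            ce = nn.CrossEntropyLoss()
            qa_loss = (ce(start_logits, start_positions) +
                       ce(end_logits, end_positions)) / 2
            misinfo_loss = ce(misinfo_logits, misinfo_labels)
            # Joint objective: answer extraction + misinformation detection.
            loss = qa_loss + self.misinfo_weight * misinfo_loss
        return {"loss": loss, "start_logits": start_logits,
                "end_logits": end_logits, "misinfo_logits": misinfo_logits}
```

Under this assumed design, the misinformation head can be used at inference time to flag or down-weight contexts predicted to be fake before the QA head selects an answer span.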