Recent years have witnessed impressive advances on challenging multi-hop QA tasks. However, these QA models may fail when faced with small disturbances in the input text, and whether they actually perform interpretable multi-hop reasoning remains uncertain. Previous adversarial attack work usually edits the whole question sentence, which has limited effect on testing entity-based multi-hop inference ability. In this paper, we propose a multi-hop reasoning chain based adversarial attack method. We formulate multi-hop reasoning chains from the query entity to the answer entity in a constructed graph, which allows us to align the question to each reasoning hop and thus attack any hop. We categorize questions into different reasoning types and adversarially modify the part of the question corresponding to the selected reasoning hop to generate the distracting sentence. We test our adversarial scheme on three QA models on the HotpotQA dataset. The results demonstrate significant performance drops on both answer and supporting facts prediction, verifying the effectiveness of our reasoning chain based attack against multi-hop reasoning models and the vulnerability of these models. Adversarial re-training further improves the performance and robustness of these models.
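To make the hop-level attack concrete, the following is a minimal Python sketch of the pipeline described above. It is an illustration under simplifying assumptions, not the paper's actual implementation: the names `Hop`, `build_reasoning_chain`, `attack`, and `distractor_entity` are hypothetical, and distracting-sentence generation is reduced to a single entity swap on the question span aligned to the attacked hop.

```python
# Minimal sketch of a reasoning-chain-based adversarial attack.
# All names and structures here are hypothetical illustrations.
from dataclasses import dataclass
from typing import List

@dataclass
class Hop:
    """One reasoning hop: an edge from a head to a tail entity in the chain."""
    head: str           # entity the hop starts from (the query entity for hop 0)
    tail: str           # entity the hop arrives at (the answer entity for the last hop)
    question_span: str  # part of the question aligned to this hop

def build_reasoning_chain(question: str, query_entity: str,
                          answer_entity: str, graph: dict) -> List[Hop]:
    """Hypothetical: walk the entity graph from the query entity to the
    answer entity and align each edge to a span of the question."""
    ...

def attack(context: List[str], chain: List[Hop],
           hop_index: int, distractor_entity: str) -> List[str]:
    """Attack one selected hop: swap its tail entity for a distractor to
    produce a distracting sentence, then append it to the context."""
    hop = chain[hop_index]
    # The distracting sentence superficially satisfies the attacked hop
    # but points to the wrong entity, misleading shortcut-based models.
    distracting_sentence = hop.question_span.replace(hop.tail, distractor_entity)
    return context + [distracting_sentence]
```

The design intuition is that perturbing a single hop, rather than the whole question, tests whether the model genuinely performs that reasoning step instead of matching surface patterns.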