Question Answering (QA) is the task of automatically answering questions posed by humans in natural language. QA comes in different settings, such as abstractive, extractive, boolean, and multiple-choice QA. As a popular topic in natural language processing, extractive question answering (extractive QA) has attracted extensive attention in the past few years. Generalized cross-lingual transfer (G-XLT), where the question and the answer context are in different languages, poses unique challenges beyond cross-lingual transfer (XLT), where the question and the answer context are in the same language. Driven by the development of related benchmarks, much work has been devoted to improving QA performance across various languages; however, only a few works are dedicated to the G-XLT task. In this work, we propose a generalized cross-lingual transfer framework to enhance the model's ability to understand different languages. Specifically, we first assemble triples from different languages to form multilingual knowledge. Since the lack of knowledge shared across languages greatly limits a model's reasoning ability, we further design a knowledge injection strategy that leverages link prediction techniques to enrich the model's store of multilingual knowledge. In this way, rich semantic knowledge can be fully exploited. Experimental results on the real-world MLQA dataset demonstrate that the proposed method improves performance by a large margin, outperforming the baseline by 13.18% F1 and 12.00% EM on average.
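For context, link prediction scores how plausible a knowledge triple (head entity, relation, tail entity) is, which is what allows missing cross-lingual links to be inferred and injected. The abstract does not specify the scoring function used, so the TransE-style formulation below is only an illustrative sketch, assuming triples assembled from different languages share one embedding space:

$$ f(h, r, t) = -\lVert \mathbf{e}_h + \mathbf{r} - \mathbf{e}_t \rVert_2 , $$

where $\mathbf{e}_h$, $\mathbf{r}$, and $\mathbf{e}_t$ denote the embeddings of the head entity, relation, and tail entity. Higher scores indicate more plausible triples, and new multilingual facts can be predicted by ranking candidate tail entities for a given $(h, r)$ pair.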