Cross-lingual pre-training has achieved great success using monolingual and bilingual plain text corpora. However, most pre-trained models neglect multilingual knowledge, which is language-agnostic but contains abundant cross-lingual structural alignments. In this paper, we propose XLM-K, a cross-lingual language model that incorporates multilingual knowledge during pre-training. XLM-K augments existing multilingual pre-training with two knowledge tasks, namely the Masked Entity Prediction task and the Object Entailment task. We evaluate XLM-K on MLQA, NER and XNLI. Experimental results clearly demonstrate significant improvements over existing multilingual language models. The results on MLQA and NER show the superiority of XLM-K on knowledge-related tasks, while the success on XNLI demonstrates the better cross-lingual transferability obtained by XLM-K. Moreover, we provide a detailed probing analysis to confirm that the desired knowledge is captured by our pre-training regimen.