Recent years have seen a revival of interest in textual entailment, sparked by i) the emergence of powerful deep neural network learners for natural language processing and ii) the timely development of large-scale evaluation datasets such as SNLI. Recast as natural language inference, the problem now amounts to detecting the relation between a pair of statements: they either contradict or entail one another, or they are mutually neutral. Current research in natural language inference is almost exclusively limited to English. In this paper, we propose to advance research in SNLI-style natural language inference toward multilingual evaluation. To that end, we provide test data for four major languages: Arabic, French, Spanish, and Russian. We experiment with a set of baseline systems based on cross-lingual word embeddings and machine translation. While our best system scores an average accuracy of just over 75%, our main focus is on enabling further research in multilingual inference.
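As an illustration of the evaluation setting described above, the following is a minimal sketch of how a three-way inference accuracy could be computed per language and then averaged over the four test languages. It is not the implementation used in this work: the function names, the language codes, and the toy labels are assumptions made for the example, and the numbers it produces are not results from the paper.

```python
from statistics import mean

# The three SNLI-style relations between a pair of statements.
LABELS = ("entailment", "contradiction", "neutral")


def accuracy(gold, predicted):
    """Fraction of statement pairs whose predicted label matches the gold label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty test set")
    for label in list(gold) + list(predicted):
        if label not in LABELS:
            raise ValueError(f"unknown label: {label!r}")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def average_accuracy(per_language):
    """Per-language accuracy plus the unweighted mean over languages.

    per_language maps a language code to a (gold, predicted) pair of label lists.
    """
    scores = {lang: accuracy(g, p) for lang, (g, p) in per_language.items()}
    return scores, mean(list(scores.values()))


E, C, N = LABELS

# Hypothetical toy labels, invented for illustration only.
toy = {
    "ar": ([E, N, C, N], [E, N, N, E]),
    "fr": ([C, E, N, E], [C, E, N, E]),
    "es": ([N, C, E, C], [N, E, E, N]),
    "ru": ([E, C, N, N], [C, C, N, E]),
}

scores, avg = average_accuracy(toy)
print(scores, avg)
```

The mean here is unweighted across languages, which is one plausible reading of "average accuracy"; a mean weighted by test-set size would be an equally reasonable choice if the per-language test sets differed in size.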