Despite the recent success of deep neural networks in natural language processing, the extent to which they can demonstrate human-like generalization capacities for natural language understanding remains unclear. We explore this issue in the domain of natural language inference (NLI), focusing on the transitivity of inference relations, a fundamental property for systematically drawing inferences. A model capturing transitivity can compose basic inference patterns and draw new inferences. We introduce an analysis method using synthetic and naturalistic NLI datasets involving clause-embedding verbs to evaluate whether models can perform transitivity inferences composed of veridical inferences and arbitrary inference types. We find that current NLI models do not perform consistently well on transitivity inference tasks, suggesting that they lack the generalization capacity for drawing composite inferences from provided training examples. The data and code for our analysis are publicly available at https://github.com/verypluming/transitivity.