Recent investigations into the inner workings of state-of-the-art large-scale pre-trained Transformer-based Natural Language Understanding (NLU) models indicate that they appear to know humanlike syntax, at least to some extent. We provide novel evidence that complicates this claim: we find that state-of-the-art Natural Language Inference (NLI) models assign the same labels to permuted examples as they do to the originals, i.e., they are largely invariant to random word-order permutations. This behavior notably differs from that of humans, who struggle to understand ungrammatical sentences. To measure the severity of this issue, we propose a suite of metrics and investigate which properties of particular permutations lead models to be word-order invariant. In the MNLI dataset, for example, we find that almost all (98.7%) examples contain at least one permutation which elicits the gold label. Models are sometimes even able to assign gold labels to permutations that they originally failed to predict correctly. We provide a comprehensive empirical evaluation of this phenomenon, and further show that the issue exists not only for Transformers but also for pre-Transformer RNN- and ConvNet-based encoders, as well as across multiple languages (English and Mandarin Chinese). Our code and data are available at https://github.com/facebookresearch/unlu.
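To make the permutation-invariance test concrete, the following is a minimal sketch (not the authors' released code; see the repository above for the actual implementation). It randomly permutes the words of an NLI hypothesis and checks whether a pretrained NLI classifier still predicts the gold label for at least one permutation. The checkpoint name "roberta-large-mnli", the number of permutations, and the helper names are illustrative assumptions.

# Minimal sketch of the permutation-acceptance check (assumptions noted above).
import random
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "roberta-large-mnli"  # assumed checkpoint; any NLI classifier would do
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
model.eval()

def permute_words(sentence: str, rng: random.Random) -> str:
    """Return the sentence with its words shuffled into a random order."""
    words = sentence.split()
    rng.shuffle(words)
    return " ".join(words)

def predict_label(premise: str, hypothesis: str) -> int:
    """Return the label index the model assigns to the (premise, hypothesis) pair."""
    inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return int(logits.argmax(dim=-1))

def any_permutation_elicits_gold(premise: str, hypothesis: str,
                                 gold_label: int, n_perms: int = 20,
                                 seed: int = 0) -> bool:
    """True if at least one of n_perms random permutations yields the gold label."""
    rng = random.Random(seed)
    return any(
        predict_label(premise, permute_words(hypothesis, rng)) == gold_label
        for _ in range(n_perms)
    )

Aggregating any_permutation_elicits_gold over a dataset gives the kind of statistic quoted above, e.g. the fraction of MNLI examples for which some permutation elicits the gold label.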