Previous work shows that deep NLP models are not always conceptually sound: they do not always learn the correct linguistic concepts. Specifically, they can be insensitive to word order. To systematically evaluate models' conceptual soundness with respect to word order, we introduce a new explanation method for sequential data: Order-sensitive Shapley Values (OSV). We conduct an extensive empirical evaluation to validate the method and to surface how well various deep NLP models learn word order. Using synthetic data, we first show that OSV is more faithful than gradient-based methods in explaining model behavior. Second, applying OSV to the HANS dataset, we discover that a BERT-based NLI model relies only on word occurrence, disregarding word order. Although simple data augmentation improves accuracy on HANS, OSV shows that augmentation does not fundamentally improve the model's learning of word order. Third, we discover that not all sentiment analysis models learn negation properly: some fail to capture the correct syntax of the negation construct. Finally, we show that pretrained language models such as BERT may rely on the absolute positions of subject words to learn long-range subject-verb agreement. For each NLP task, we also demonstrate how OSV can be leveraged to generate adversarial examples.
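As background for OSV, the classical (order-insensitive) Shapley value attributes a model's output to input tokens by averaging each token's marginal contribution over all orderings. The sketch below computes exact Shapley values for a toy coalition game; `toy_value` and the token names are illustrative assumptions, not the paper's models, and the paper's OSV extends this formulation to be sensitive to word order.

```python
from itertools import permutations

def shapley_values(players, value_fn):
    """Exact Shapley values: average each player's marginal
    contribution over every ordering of the players."""
    contrib = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = set()
        prev = value_fn(coalition)
        for p in order:
            coalition.add(p)
            cur = value_fn(coalition)
            contrib[p] += cur - prev
            prev = cur
    return {p: c / len(orderings) for p, c in contrib.items()}

# Toy "model" score over token subsets (hypothetical): "good" alone
# contributes 0.5, and "not" + "good" together add an interaction of 1.0.
def toy_value(coalition):
    v = 0.5 if "good" in coalition else 0.0
    if "not" in coalition and "good" in coalition:
        v += 1.0
    return v

attr = shapley_values(["not", "good", "the"], toy_value)
# "the" receives zero credit; "not" and "good" split the interaction.
```

Note that because classical Shapley values treat inputs as an unordered set, "not good" and "good not" would receive identical attributions; this is exactly the limitation that an order-sensitive variant must address.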