Metamorphic testing has recently been used to check the safety of neural NLP models. Its main advantage is that it does not rely on a ground truth to generate test cases. However, existing studies are mostly concerned with robustness-like metamorphic relations, limiting the scope of linguistic properties they can test. We propose three new classes of metamorphic relations, which address the properties of systematicity, compositionality and transitivity. Unlike robustness, our relations are defined over multiple source inputs, thus increasing the number of test cases that we can produce by a polynomial factor. With them, we test the internal consistency of state-of-the-art NLP models, and show that they do not always behave according to their expected linguistic properties. Lastly, we introduce a novel graphical notation that efficiently summarises the inner structure of metamorphic relations.
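To make the idea of a transitivity-style metamorphic relation concrete, here is a minimal sketch. The `are_paraphrases` function is a hypothetical stand-in for a real NLP model (a trivial word-overlap heuristic, not any model from the paper); the test itself only checks the model's internal consistency, with no ground-truth labels, and enumerates triples of source inputs rather than single ones.

```python
from itertools import combinations

def are_paraphrases(a: str, b: str) -> bool:
    """Hypothetical stand-in for a paraphrase-detection model:
    calls two sentences paraphrases when their word sets overlap
    enough (Jaccard similarity >= 0.5). Illustration only."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) >= 0.5

def transitivity_violations(sentences):
    """Transitivity metamorphic relation: if the model labels
    (a, b) and (b, c) as paraphrases, consistency requires it to
    label (a, c) as paraphrases too. No ground truth is needed;
    each violating triple is a failed test case."""
    violations = []
    for a, b, c in combinations(sentences, 3):
        if (are_paraphrases(a, b) and are_paraphrases(b, c)
                and not are_paraphrases(a, c)):
            violations.append((a, b, c))
    return violations

corpus = [
    "the cat sat on the mat",
    "the cat sat on a mat",
    "a cat lay on a mat",
    "dogs chase cars",
]
# The first three sentences form a chain of pairwise paraphrases
# whose endpoints the stub model fails to relate, so the triple
# is reported as a consistency violation.
print(transitivity_violations(corpus))
```

Because the relation ranges over triples of inputs, a corpus of n sentences yields on the order of n^3 candidate test cases, which is the polynomial blow-up in test cases the abstract refers to.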