Machine intelligence is increasingly being linked to claims about sentience, language processing, and an ability to comprehend and transform natural language into a range of stimuli. We systematically analyze the ability of DALL-E 2 to capture eight grammatical phenomena pertaining to compositionality that are widely discussed in linguistics and pervasive in human language: binding principles and coreference, passives, structural ambiguity, negation, word order, double object constructions, sentence coordination, ellipsis, and comparatives. Whereas young children routinely master these phenomena, learning systematic mappings between syntax and semantics, DALL-E 2 is unable to reliably infer meanings that are consistent with the syntax. These results challenge recent claims concerning the capacity of such systems to understand human language. We make available the full set of test materials as a benchmark for future testing.