The recent advent of large language models has reinvigorated debate over whether human cognitive capacities might emerge in such generic models given sufficient training data. Of particular interest is the ability of these models to reason about novel problems zero-shot, without any direct training. In human cognition, this capacity is closely tied to an ability to reason by analogy. Here, we performed a direct comparison between human reasoners and a large language model (the text-davinci-003 variant of GPT-3) on a range of analogical tasks, including a novel text-based matrix reasoning task closely modeled on Raven's Progressive Matrices. We found that GPT-3 displayed a surprisingly strong capacity for abstract pattern induction, matching or even surpassing human capabilities in most settings. Our results indicate that large language models such as GPT-3 have acquired an emergent ability to find zero-shot solutions to a broad range of analogy problems.