LLMs as Models for Analogical Reasoning

From arXiv. The title has been changed from "Semantic Structure-Mapping in LLM and Human Analogical Reasoning" to "LLMs as Models for Analogical Reasoning" to improve clarity and accuracy.

Analogical reasoning -- the capacity to identify and map structural relationships between different domains -- is fundamental to human cognition and learning. Recent studies have shown that large language models (LLMs) can sometimes match humans in analogical reasoning tasks, opening the possibility that analogical reasoning might emerge from domain-general processes. However, it is still debated whether these emergent capacities are largely superficial and limited to simple relations seen during training or whether they encompass the flexible representational and mapping capabilities which are the focus of leading cognitive models of analogy. In this study, we introduce novel analogical reasoning tasks that require participants to map between semantically contentful words and sequences of letters and other abstract characters. This task necessitates the ability to flexibly re-represent rich semantic information -- an ability which is known to be central to human analogy but which is thus far not well captured by existing cognitive theories and models. We assess the performance of both human participants and LLMs on tasks focusing on reasoning from semantic structure and semantic content, introducing variations that test the robustness of their analogical inferences. Advanced LLMs match human performance across several conditions, though humans and LLMs respond differently to certain task variations and semantic distractors. Our results thus provide new evidence that LLMs might offer a how-possibly explanation of human analogical reasoning in contexts that are not yet well modeled by existing theories, but that even today's best models are unlikely to yield how-actually explanations.



Analogical reasoning


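Analogical reasoning, also called "analogy", is a form of inference. On the basis that two objects are the same or similar in certain attributes, it infers by comparison that they are also the same in other attributes. Because it starts from the observation of individual phenomena, it resembles inductive reasoning. However, it proceeds from the particular to the particular rather than from the particular to the general, and so it differs from inductive reasoning. It takes two forms: complete analogy and incomplete analogy. In complete analogy, the two things or classes of things are identical in the respects being compared; in incomplete analogy, they are not completely identical in those respects.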
