The remarkable performance of large language models (LLMs) on complex linguistic tasks has sparked a lively debate on the nature of their capabilities. Unlike humans, these models learn language exclusively from textual data, without direct interaction with the real world. Nevertheless, they can generate seemingly meaningful text about a wide range of topics. This impressive accomplishment has rekindled interest in the classic 'Symbol Grounding Problem,' which asked whether the internal representations and outputs of classical symbolic AI systems could possess intrinsic meaning. Unlike those systems, modern LLMs are artificial neural networks that compute over vectors rather than symbols. However, an analogous problem arises for such systems, which we dub the Vector Grounding Problem. This paper has two primary objectives. First, we differentiate the various ways in which internal representations can be grounded in biological or artificial systems, identifying five distinct notions discussed in the literature: referential, sensorimotor, relational, communicative, and epistemic grounding. Unfortunately, these notions of grounding are often conflated. We clarify the differences between them, and argue that referential grounding is the one that lies at the heart of the Vector Grounding Problem. Second, drawing on theories of representational content in philosophy and cognitive science, we propose that certain LLMs, particularly those fine-tuned with Reinforcement Learning from Human Feedback (RLHF), possess the necessary features to overcome the Vector Grounding Problem, as they stand in the requisite causal-historical relations to the world that underpin intrinsic meaning. We also argue that, perhaps unexpectedly, multimodality and embodiment are neither necessary nor sufficient conditions for referential grounding in artificial systems.