Geometry of Decision Making in Language Models

Large Language Models (LLMs) show strong generalization across diverse tasks, yet the internal decision-making processes behind their predictions remain opaque. In this work, we study the geometry of hidden representations in LLMs through the lens of \textit{intrinsic dimension} (ID), focusing specifically on decision-making dynamics in a multiple-choice question answering (MCQA) setting. We perform a large-scale study of 28 open-weight transformer models, estimating ID across layers with multiple estimators and quantifying per-layer performance on MCQA tasks. Our findings reveal a consistent ID pattern across models: early layers operate on low-dimensional manifolds, middle layers expand this space, and later layers compress it again, converging to decision-relevant representations. Together, these results suggest that LLMs implicitly learn to project linguistic inputs onto structured, low-dimensional manifolds aligned with task-specific decisions, providing new geometric insights into how generalization and reasoning emerge in language models.
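To make the notion of per-layer intrinsic dimension concrete, the sketch below estimates ID from a set of hidden-state vectors using the TwoNN estimator, which relies only on the ratio of each point's second- to first-nearest-neighbor distance. The abstract does not name the estimators used, so TwoNN is an assumption chosen for illustration, and the function names `twonn_id` and `layerwise_id` are hypothetical. The input is assumed to be one array of shape (num_examples, hidden_size) per layer, for example the hidden state at the final token of each MCQA prompt.

```python
import numpy as np


def twonn_id(points: np.ndarray) -> float:
    """Maximum-likelihood TwoNN estimate of intrinsic dimension.

    Assumption: TwoNN is used here only as an illustrative estimator;
    the source text does not specify which estimators were applied.
    """
    x = np.asarray(points, dtype=np.float64)

    # Pairwise squared Euclidean distances via the Gram-matrix identity.
    sq_norms = np.sum(x * x, axis=1)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)
    np.maximum(d2, 0.0, out=d2)      # clip small negative rounding errors
    np.fill_diagonal(d2, np.inf)     # exclude each point's distance to itself

    # After partitioning at index 1, columns 0 and 1 hold the two smallest
    # distances in each row, in ascending order.
    nearest = np.sqrt(np.partition(d2, 1, axis=1)[:, :2])
    r1, r2 = nearest[:, 0], nearest[:, 1]

    # Drop exact duplicates (r1 == 0), for which the ratio is undefined.
    keep = r1 > 0
    mu = r2[keep] / r1[keep]

    # Under the TwoNN model, mu is Pareto-distributed with shape d,
    # so the maximum-likelihood estimate is d = N / sum(log mu).
    return float(keep.sum() / np.sum(np.log(mu)))


def layerwise_id(hidden_states: list[np.ndarray]) -> list[float]:
    """Apply the estimator to each layer's (num_examples, hidden_size) array."""
    return [twonn_id(layer) for layer in hidden_states]
```

Plotting the output of `layerwise_id` against layer index would give the kind of ID-versus-depth profile the abstract describes. The full pairwise distance matrix costs memory quadratic in the number of examples, so a nearest-neighbor index would be a more practical choice for large sample sizes.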
