OpenAI has recently argued that hallucinations in large language models result primarily from misaligned evaluation incentives that reward confident guessing rather than epistemic humility. On this view, hallucination is a contingent behavioral artifact, remediable through improved benchmarks and reward structures. In this paper, we challenge that interpretation. Drawing on previous work on structural hallucination and on empirical experiments using a Licensing Oracle, we argue that hallucination is not an optimization failure but an architectural inevitability of transformer models. Transformers do not represent the world; they model statistical associations among tokens. Their embedding spaces form a pseudo-ontology derived from linguistic co-occurrence rather than from world-referential structure. At ontological boundary conditions (regions where training data is sparse or incoherent), the model necessarily interpolates fictional continuations in order to preserve coherence. No incentive mechanism can override this structural dependence on pattern completion. Our empirical results demonstrate that hallucination can be eliminated only through external truth-validation and abstention modules, not through changes to incentives, prompting, or fine-tuning. The Licensing Oracle achieves perfect abstention precision across domains precisely because it supplies the grounding that the transformer lacks. We conclude that hallucination is a structural property of generative architectures and that reliable AI requires hybrid systems that distinguish linguistic fluency from epistemic responsibility.
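The abstention mechanism described above can be illustrated with a minimal sketch. All names here (`KnowledgeBase`, `LicensingOracle`, `license`) are hypothetical illustrations of the general pattern, not the paper's actual implementation: a fluent but ungrounded completion is passed through an external validator, which either licenses the claim against an explicit fact store or forces abstention.

```python
# Hypothetical sketch of an abstention-gated generation pipeline.
# Class and method names are illustrative, not the paper's implementation.
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    subject: str
    predicate: str
    obj: str


class KnowledgeBase:
    """Toy external ground truth: a set of licensed (subject, predicate, object) triples."""

    def __init__(self, facts):
        self.facts = set(facts)

    def supports(self, claim: Claim) -> bool:
        return (claim.subject, claim.predicate, claim.obj) in self.facts


class LicensingOracle:
    """Gate that emits a fluent answer only when the external KB grounds the
    underlying claim; otherwise it abstains rather than pattern-completing."""

    ABSTAIN = "I don't know."

    def __init__(self, kb: KnowledgeBase):
        self.kb = kb

    def license(self, claim: Claim, fluent_answer: str) -> str:
        # The generator supplies fluency; grounding comes only from the KB.
        return fluent_answer if self.kb.supports(claim) else self.ABSTAIN


kb = KnowledgeBase({("Paris", "capital_of", "France")})
oracle = LicensingOracle(kb)

grounded = oracle.license(
    Claim("Paris", "capital_of", "France"),
    "Paris is the capital of France.",
)
ungrounded = oracle.license(
    Claim("Atlantis", "capital_of", "France"),
    "Atlantis is the capital of France.",
)
print(grounded)    # licensed: the claim is grounded in the KB
print(ungrounded)  # abstention: the claim has no external support
```

The point of the sketch is the separation of concerns: the generator never decides truth; the validation module alone licenses assertion or abstention, which is the structural remedy the abstract argues incentives cannot provide.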