Large Language Models (LLMs) frequently produce hallucinated statements that are assigned high likelihood by the model itself, exposing a fundamental limitation of probability-based verification. This suggests that hallucination is often not a low-confidence phenomenon but a failure of structural consistency. In this work, we reformulate the verification of LLM reasoning as a Constraint Satisfaction Problem (CSP) that operates independently of generation likelihood. Rather than optimizing for statistical plausibility, we model verification as a feasibility check based on structural violation cost -- the computational cost required to embed a candidate reasoning step into the contextual graph structure. We define a total cost function composed of three proxies: (i) graph connectivity (structural), (ii) feature-space consistency (geometric), and (iii) logical entailment (symbolic). Crucially, verification is performed via a lightweight System-2 gate, Eidoku, which rejects candidates exceeding a context-calibrated cost threshold. The threshold is not learned but is derived from the intrinsic statistics of the context, avoiding ad hoc heuristics. We demonstrate that this approach successfully rejects ``smooth falsehoods'' -- statements that are highly probable yet structurally disconnected -- which probability-based verifiers are in principle incapable of detecting. Our experiments on a controlled diagnostic dataset show that explicitly enforcing structural constraints enables the deterministic rejection of this specific class of hallucinations, serving as a neuro-symbolic sanity check for generative reasoning.
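For concreteness, the gate described above admits a minimal formal sketch. The notation here is illustrative and not taken from the paper: $s$ denotes a candidate reasoning step, $\mathcal{G}$ the contextual graph, $C_{\mathrm{struct}}$, $C_{\mathrm{geom}}$, $C_{\mathrm{logic}}$ the three proxy costs, $\lambda_i$ their weights, and $\tau$ the context-calibrated threshold:
\[
C(s \mid \mathcal{G}) \;=\; \lambda_1\, C_{\mathrm{struct}}(s, \mathcal{G}) \;+\; \lambda_2\, C_{\mathrm{geom}}(s, \mathcal{G}) \;+\; \lambda_3\, C_{\mathrm{logic}}(s, \mathcal{G}),
\qquad
\text{accept } s \;\iff\; C(s \mid \mathcal{G}) \le \tau(\mathcal{G}).
\]
One natural instantiation of a threshold ``derived from the intrinsic statistics of the context'' is $\tau(\mathcal{G}) = \mu_C(\mathcal{G}) + k\,\sigma_C(\mathcal{G})$, where $\mu_C$ and $\sigma_C$ are the mean and standard deviation of the cost over steps already embedded in the context; this particular form is an assumption consistent with, but not specified by, the abstract.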