Measuring the in-context computational effort of language models is a key challenge: metrics such as next-token loss fail to capture reasoning complexity, and prior methods based on latent-state compressibility can be invasive and unstable. We propose Multiple Token Divergence (MTD), a simple measure of computational effort defined as the KL divergence between a model's full output distribution and that of a shallow, auxiliary prediction head. MTD can be computed directly from pre-trained models that already have multiple prediction heads, requiring no additional training. Building on this, we introduce Divergence Steering, a decoding method for controlling the computational character of generated text. We empirically show that MTD distinguishes complex tasks from simple ones more effectively than prior methods; on mathematical reasoning benchmarks, it correlates positively with problem difficulty, and lower MTD is associated with more accurate reasoning. MTD thus provides a practical, lightweight tool for analyzing and steering the computational dynamics of language models.
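Concretely, per-token MTD can be computed from two sets of logits over the same vocabulary. The sketch below is a minimal PyTorch illustration, assuming the divergence is taken as KL(p_full ‖ p_head) at each position; the function name and shape conventions here are our own assumptions, not notation from the paper.

```python
import torch
import torch.nn.functional as F

def multiple_token_divergence(full_logits: torch.Tensor,
                              head_logits: torch.Tensor) -> torch.Tensor:
    """Per-token MTD: KL(p_full || p_head) between the model's full
    next-token distribution and a shallow auxiliary head's distribution.

    Assumed shapes (our convention): full_logits and head_logits are
    (seq_len, vocab_size) pre-softmax scores for the same positions.
    Returns a (seq_len,) tensor of per-position divergences in nats.
    """
    log_p_full = F.log_softmax(full_logits, dim=-1)
    log_p_head = F.log_softmax(head_logits, dim=-1)
    # KL(p || q) = sum_v p(v) * (log p(v) - log q(v))
    return torch.sum(log_p_full.exp() * (log_p_full - log_p_head), dim=-1)

if __name__ == "__main__":
    torch.manual_seed(0)
    full = torch.randn(5, 32000)   # e.g. 5 positions, 32k-token vocab
    head = torch.randn(5, 32000)
    print(multiple_token_divergence(full, head))  # per-token MTD in nats
```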
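The abstract does not specify the mechanism of Divergence Steering. One plausible reading, sketched below entirely under our own assumptions, is a contrastive-style reweighting of next-token scores: a coefficient alpha > 0 biases decoding toward tokens on which the full model and the shallow head disagree (raising MTD), while alpha < 0 biases it toward agreement (lowering MTD). The actual method may differ.

```python
import torch
import torch.nn.functional as F

def divergence_steered_logits(full_logits: torch.Tensor,
                              head_logits: torch.Tensor,
                              alpha: float) -> torch.Tensor:
    """Hypothetical steering rule (our assumption, not necessarily the
    paper's method): shift the full model's log-probabilities by alpha
    times the per-token log-ratio log p_full(v) - log p_head(v).

    full_logits, head_logits: (vocab_size,) logits for the next token.
    Returns adjusted scores suitable for softmax sampling.
    """
    log_p_full = F.log_softmax(full_logits, dim=-1)
    log_p_head = F.log_softmax(head_logits, dim=-1)
    return log_p_full + alpha * (log_p_full - log_p_head)
```

Under this sketch, sampling from softmax(divergence_steered_logits(...)) with alpha = 0 recovers standard decoding from the full model.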