Recent work by \citet{hendrycks2025agidefinition} formalized \textit{Artificial General Intelligence} (AGI) as the arithmetic mean of proficiencies across cognitive domains derived from the Cattell--Horn--Carroll (CHC) model of human cognition. While elegant, this definition assumes \textit{compensability} -- that exceptional ability in some domains can offset failure in others. True general intelligence, however, should reflect \textit{coherent sufficiency}: balanced competence across all essential domains. We propose a coherence-aware measure of AGI based on the integral of generalized means over a continuum of compensability exponents. This formulation spans the arithmetic, geometric, and harmonic regimes, and the resulting \textit{area under the curve} (AUC) quantifies robustness under varying compensability assumptions. Unlike the arithmetic mean, which rewards specialization, the AUC penalizes imbalance and captures inter-domain dependency. Applied to published CHC-based domain scores for GPT-4 and GPT-5, the coherence-adjusted AUC reveals that both systems remain far from general competence (e.g., GPT-5 at~24\%) despite their high arithmetic scores. Integrating the generalized mean thus yields a principled, interpretable, and stricter foundation for measuring genuine progress toward AGI.
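For concreteness, one way to instantiate the measure sketched above is to integrate the power (generalized) mean of the domain scores over a range of compensability exponents; the exponent range and normalization shown here are illustrative assumptions rather than the definition adopted in the body of the paper. With domain proficiencies $s_1,\dots,s_n \in (0,1]$,
\[
  M_p(s) \;=\; \Bigl(\tfrac{1}{n}\sum_{i=1}^{n} s_i^{\,p}\Bigr)^{1/p},
  \qquad
  M_0(s) \;=\; \lim_{p\to 0} M_p(s) \;=\; \Bigl(\prod_{i=1}^{n} s_i\Bigr)^{1/n},
\]
\[
  \mathrm{AUC}(s) \;=\; \frac{1}{p_{\max}-p_{\min}} \int_{p_{\min}}^{p_{\max}} M_p(s)\,\mathrm{d}p ,
\]
where $p=1$, $p\to 0$, and $p=-1$ recover the arithmetic, geometric, and harmonic means, respectively, so any assumed range with $[p_{\min},p_{\max}] \supseteq [-1,1]$ spans all three regimes. Because $M_p(s)$ is non-decreasing in $p$ and increasingly dominated by the weakest domain as $p$ decreases, the integral is pulled down by any near-zero score, which is the intended penalty on imbalance.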