We introduce the Boltzmann-Shannon Index (BSI), a normalized measure for clustered continuous data that captures the interaction between frequency-based and geometry-based probability distributions. Building on ideas from geometric coarse-graining and information theory, the BSI quantifies how well a partition reflects both the population of each cluster and its effective geometric extent. We illustrate its behavior on synthetic Gaussian mixtures, the Iris benchmark, and a high-imbalance resource-allocation scenario, showing that the index gives a coherent assessment even when traditional metrics give incomplete or misleading signals. In resource-allocation settings, we further show that the BSI is highly sensitive to severe density-geometry inconsistency, and that it provides a smooth, gradient-friendly objective favoring allocations that balance each group's demographic weight against its effective spread in the outcome space. The same smoothness allows the BSI to be embedded as a regularizer in optimization frameworks for policy-making and algorithmic governance.