Confidence calibration is crucial to the reliability of decisions made by machine learning systems. However, discriminative classifiers based on deep neural networks are often criticized for producing overconfident predictions that fail to reflect the true likelihood of correctness. We argue that this inability to model uncertainty stems mainly from the closed-world nature of softmax: a model trained with the cross-entropy loss is forced to classify each input into one of $K$ pre-defined categories with high probability. To address this problem, we propose, for the first time, a novel $(K+1)$-way softmax formulation that incorporates the modeling of open-world uncertainty as an extra dimension. To unify the learning of the original $K$-way classification task and the extra uncertainty-modeling dimension, we propose a novel energy-based objective function and theoretically prove that optimizing this objective essentially forces the extra dimension to capture the marginal data distribution. Extensive experiments show that our approach, Energy-based Open-World Softmax (EOW-Softmax), is superior to existing state-of-the-art methods in improving confidence calibration.
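A minimal numerical sketch of the $(K+1)$-way idea, assuming (hypothetically) that the extra uncertainty dimension is realized as one additional logit appended to the $K$ class logits before the softmax; the function name and signature are illustrative, not the paper's implementation:

```python
import numpy as np

def k_plus_one_softmax(class_logits, uncertainty_logit):
    """(K+1)-way softmax sketch: K class logits plus one extra logit
    whose probability mass represents open-world uncertainty.
    Returns (class_probs over K classes, uncertainty mass); together
    they sum to 1, so confident class predictions are only possible
    when the uncertainty logit is comparatively small."""
    z = np.concatenate([class_logits, [uncertainty_logit]])
    z = z - z.max()                      # subtract max for numerical stability
    p = np.exp(z) / np.exp(z).sum()
    return p[:-1], p[-1]

# Illustration: raising the extra logit shifts probability mass away
# from the K classes and into the uncertainty dimension.
probs_low, u_low = k_plus_one_softmax(np.array([2.0, 1.0, 0.5]), -5.0)
probs_high, u_high = k_plus_one_softmax(np.array([2.0, 1.0, 0.5]), 5.0)
```

Under this formulation, calibration improves because the model is no longer forced to distribute all probability mass over the $K$ closed-world categories.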