Neural networks have proven successful at learning from complex data distributions, acting as universal function approximators. However, they are often overconfident in their predictions, which leads to inaccurate and miscalibrated probabilistic predictions. The problem of overconfidence becomes especially apparent when the test-time data distribution differs from the one seen during training. We propose a solution to this problem: we seek out regions of feature space where the model is unjustifiably overconfident, and conditionally raise the entropy of those predictions towards that of the prior distribution of the labels. Our method yields a better-calibrated network and is agnostic to the underlying model structure, so it can be applied to any neural network that produces a probability density as an output. We demonstrate the effectiveness of our method on both classification and regression problems, applying it to recent probabilistic neural network models.
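The abstract does not specify the algorithm, but the core operation it describes, conditionally mixing an overconfident prediction with the label prior to raise its entropy, can be sketched for the classification case as follows. The function name `temper_predictions`, the `confidence_score` detector, and the `threshold` parameter are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

def temper_predictions(probs, prior, confidence_score, threshold=0.5):
    """Sketch: interpolate flagged predictions toward the label prior.

    probs:            (N, K) predicted class probabilities
    prior:            (K,) prior distribution over labels, e.g. empirical
                      class frequencies or uniform (assumed here)
    confidence_score: (N,) scores in [0, 1]; higher means the model's
                      confidence at that point is judged unjustified
                      (how this is detected is an assumption, not given
                      in the abstract)
    threshold:        points scoring below this are left untouched
    """
    # Mixing weight: 0 for unflagged points, otherwise the score itself.
    lam = np.where(confidence_score > threshold, confidence_score, 0.0)
    # Convex mixture with the prior: lam = 0 keeps the prediction as-is,
    # lam = 1 replaces it with the prior. Moving a peaked prediction
    # toward a higher-entropy prior raises the prediction's entropy.
    return (1.0 - lam[:, None]) * probs + lam[:, None] * prior[None, :]

# Example: two confident predictions; only the second is flagged.
probs = np.array([[0.95, 0.03, 0.02],
                  [0.90, 0.05, 0.05]])
prior = np.full(3, 1.0 / 3.0)   # uniform label prior (assumption)
scores = np.array([0.1, 0.9])
print(temper_predictions(probs, prior, scores))
# First row is unchanged; second becomes [0.39, 0.305, 0.305].
```

For regression, the analogous operation would widen the predicted density toward the prior over targets rather than mixing categorical probabilities; the abstract covers both settings but gives no further detail.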