Prevalent deep learning models suffer from significant over-confidence under distribution shifts. In this paper, we propose Density-Softmax, a single deterministic approach to uncertainty estimation that combines a density function with the softmax layer. By using the likelihood value of the latent representation, our approach produces more uncertain predictions when test samples are distant from the training samples. Theoretically, we prove that Density-Softmax is distance-aware, meaning its associated uncertainty metrics are monotonic functions of distance metrics. This has been shown to be a necessary condition for a neural network to produce high-quality uncertainty estimates. Empirically, our method achieves computational efficiency comparable to the standard softmax on the shifted CIFAR-10, CIFAR-100, and ImageNet datasets across modern deep learning architectures. Notably, Density-Softmax uses 4 times fewer parameters than Deep Ensembles and has 6 times lower latency than a Rank-1 Bayesian Neural Network, while obtaining competitive predictive performance and lower calibration errors under distribution shifts.
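To make the core idea concrete, below is a minimal sketch of a density-scaled softmax, assuming the log-likelihood of the latent representation comes from some fitted density estimator (this abstract does not specify which one); the names density_softmax, log_density, and log_density_max are illustrative, not the paper's API. The normalized likelihood scales the logits, so an unlikely latent flattens the predictive distribution toward uniform.

import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def density_softmax(logits, log_density, log_density_max):
    # Normalized likelihood in (0, 1]: close to 1 near the training
    # manifold, approaching 0 for latents far from the training data.
    # (Hypothetical normalization; the paper's exact scheme may differ.)
    alpha = np.exp(log_density - log_density_max)
    # Scaling the logits by alpha flattens the softmax toward the
    # uniform distribution when the latent is unlikely, yielding
    # more uncertain predictions for distant (shifted) test samples.
    return softmax(alpha * logits)

# Toy usage: identical logits give a confident prediction for an
# in-distribution latent and a near-uniform one for an outlier.
logits = np.array([4.0, 1.0, 0.5])
print(density_softmax(logits, log_density=-1.0, log_density_max=-1.0))   # confident
print(density_softmax(logits, log_density=-50.0, log_density_max=-1.0))  # ~uniform

Because the uncertainty-inflating factor is a deterministic function of a single forward pass, this construction needs no sampling or ensembling at test time, which is consistent with the efficiency claims above.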