Accurate uncertainty quantification is a major challenge in deep learning, as neural networks can make overconfident errors and assign high-confidence predictions to out-of-distribution (OOD) inputs. The most popular approaches for estimating predictive uncertainty in deep learning are methods that combine predictions from multiple neural networks, such as Bayesian neural networks (BNNs) and deep ensembles. However, their practicality in real-time, industrial-scale applications is limited due to their high memory and computational cost. Furthermore, ensembles and BNNs do not necessarily fix all the issues with the underlying member networks. In this work, we study principled approaches to improve the uncertainty properties of a single network, based on a single, deterministic representation. By formalizing uncertainty quantification as a minimax learning problem, we first identify distance awareness, i.e., the model's ability to quantify the distance of a test example from the training data, as a necessary condition for a DNN to achieve high-quality (i.e., minimax optimal) uncertainty estimation. We then propose Spectral-normalized Neural Gaussian Process (SNGP), a simple method that improves the distance-awareness ability of modern DNNs with two simple changes: (1) applying spectral normalization to the hidden weights to enforce bi-Lipschitz smoothness in the representations, and (2) replacing the last output layer with a Gaussian process layer. On a suite of vision and language understanding benchmarks, SNGP outperforms other single-model approaches in prediction, calibration, and out-of-domain detection. Furthermore, SNGP provides complementary benefits to popular techniques such as deep ensembles and data augmentation, making it a simple and scalable building block for probabilistic deep learning. Code is open-sourced at https://github.com/google/uncertainty-baselines
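The two changes described above can be illustrated with a minimal NumPy sketch, assuming a spectral-norm rescaling of a hidden weight matrix and a random-Fourier-feature approximation of the Gaussian process output layer. All names and hyperparameters here (`spectral_normalize`, `RandomFeatureGP`, the 0.95 norm bound, 256 features) are illustrative choices, not the paper's reference implementation.

```python
import numpy as np

def spectral_normalize(W, target_norm=0.95):
    """(1) Rescale a weight matrix so its spectral norm is at most
    target_norm, enforcing a Lipschitz bound on the layer."""
    sigma = np.linalg.norm(W, 2)  # largest singular value
    if sigma > target_norm:
        W = W * (target_norm / sigma)
    return W

class RandomFeatureGP:
    """(2) Approximate GP output layer via random Fourier features:
    a fixed random projection followed by a trainable linear map."""
    def __init__(self, in_dim, num_features=256, num_classes=10, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((num_features, in_dim))  # fixed, untrained
        self.b = 2.0 * np.pi * rng.random(num_features)
        self.beta = rng.standard_normal((num_classes, num_features)) * 0.01

    def __call__(self, h):
        phi = np.sqrt(2.0 / len(self.b)) * np.cos(h @ self.W.T + self.b)
        return phi @ self.beta.T  # logits

# Tiny forward pass: spectrally normalized hidden layer + GP head.
rng = np.random.default_rng(1)
W_hidden = spectral_normalize(rng.standard_normal((64, 32)))
gp_head = RandomFeatureGP(in_dim=64)
x = rng.standard_normal((4, 32))          # batch of 4 inputs
h = np.maximum(x @ W_hidden.T, 0.0)       # ReLU hidden representation
logits = gp_head(h)
print(logits.shape)  # (4, 10)
```

In the full method, the posterior variance of the GP head (not shown here) supplies the distance-aware uncertainty estimate; this sketch only shows where the two architectural changes sit in the network.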