This paper proposes a fast and scalable method for uncertainty quantification of machine learning models' predictions. First, we show a principled way to measure the uncertainty of a classifier's predictions based on the Nadaraya-Watson nonparametric estimate of the conditional label distribution. Importantly, the proposed approach allows aleatoric and epistemic uncertainties to be explicitly disentangled. The resulting method works directly in the feature space; however, one can apply it to any neural network by considering the embedding of the data induced by the network. We demonstrate the strong performance of the method on uncertainty estimation tasks for text classification problems and a variety of real-world image datasets, such as MNIST, SVHN, CIFAR-100, and several versions of ImageNet.
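To make the high-level description concrete, below is a minimal sketch of a Nadaraya-Watson kernel estimate of the conditional label distribution computed over network embeddings. The RBF kernel, the bandwidth, and the aleatoric/epistemic proxies (predictive entropy and inverse kernel mass) are illustrative assumptions for this sketch, not the paper's exact formulation.

```python
# Minimal, generic sketch of a Nadaraya-Watson estimate of class
# probabilities in an embedding space. The kernel choice (RBF), the
# bandwidth, and the uncertainty proxies below are illustrative
# assumptions, not the method's exact formulation.
import numpy as np

def rbf_kernel(query, bank, bandwidth=1.0):
    """Gaussian kernel weights between a query embedding and a bank of
    training embeddings; returns shape (n_bank,)."""
    d2 = np.sum((bank - query) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * bandwidth ** 2))

def nadaraya_watson_predict(query, bank_emb, bank_labels, n_classes, bandwidth=1.0):
    """Kernel-smoothed (Nadaraya-Watson) estimate of p(y | query)."""
    w = rbf_kernel(query, bank_emb, bandwidth)       # (n_bank,)
    onehot = np.eye(n_classes)[bank_labels]          # (n_bank, n_classes)
    mass = w.sum()
    probs = (w[:, None] * onehot).sum(axis=0) / (mass + 1e-12)
    # Illustrative uncertainty proxies (hypothetical, not the paper's):
    #   aleatoric: entropy of the smoothed label distribution,
    #   epistemic: low total kernel mass -> query lies far from the data.
    aleatoric = -np.sum(probs * np.log(probs + 1e-12))
    epistemic = 1.0 / (1.0 + mass)
    return probs, aleatoric, epistemic

# Toy usage: embeddings would normally come from a trained network.
rng = np.random.default_rng(0)
bank_emb = rng.normal(size=(500, 16))
bank_labels = rng.integers(0, 10, size=500)
probs, alea, epi = nadaraya_watson_predict(bank_emb[0], bank_emb, bank_labels, n_classes=10)
print(probs.sum(), alea, epi)
```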