As neural network classifiers are deployed in real-world applications, it is crucial that their failures can be detected reliably. One practical solution is to assign confidence scores to each prediction and then use these scores to filter out possible misclassifications. However, existing confidence metrics are not yet sufficiently reliable for this role. This paper presents a new framework that produces a quantitative metric for detecting misclassification errors. This framework, RED, builds an error detector on top of the base classifier and estimates the uncertainty of the detection scores using Gaussian Processes. Experimental comparisons with other error detection methods on 125 UCI datasets demonstrate that this approach is effective. Further implementations on two probabilistic base classifiers and two large deep learning architectures in vision tasks confirm that the method is robust and scalable. Finally, an empirical analysis of RED with out-of-distribution and adversarial samples shows that the method can be used not only to detect errors but also to understand where they come from. RED can thereby be used to improve the trustworthiness of neural network classifiers more broadly in the future.
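To illustrate how such a framework could be organized, the following is a minimal sketch under assumed design choices: a base classifier's maximum-softmax confidence is corrected by a Gaussian Process regressor fit on held-out residuals (prediction correctness minus confidence), and the GP's predictive standard deviation serves as the uncertainty of the detection score. The residual target, the scikit-learn components, the kernel, and the flagging threshold are illustrative assumptions, not the paper's exact formulation.

# Sketch of a RED-style misclassification detector (illustrative assumptions,
# not the paper's exact design).
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_digits

X, y = load_digits(return_X_y=True)
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

# 1. Train the base classifier.
base = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300, random_state=0)
base.fit(X_train, y_train)

# 2. On held-out data, compute residuals between correctness and confidence.
conf_val = base.predict_proba(X_val).max(axis=1)            # base confidence score
correct_val = (base.predict(X_val) == y_val).astype(float)  # 1 if prediction is correct
residual_val = correct_val - conf_val                       # target for the error detector

# 3. Fit a Gaussian Process to predict the residual, with uncertainty.
gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp.fit(X_val, residual_val)

# 4. Detection score: base confidence corrected by the predicted residual;
#    the GP standard deviation quantifies the uncertainty of that score.
conf_test = base.predict_proba(X_test).max(axis=1)
res_mean, res_std = gp.predict(X_test, return_std=True)
red_score = conf_test + res_mean

# Flag likely misclassifications: low corrected score or unusually high uncertainty
# (the 0.5 cutoff and 2-sigma rule are placeholder thresholds).
flagged = (red_score < 0.5) | (res_std > res_std.mean() + 2 * res_std.std())
print(f"Flagged {flagged.sum()} of {len(X_test)} test predictions as suspect.")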