Deep neural networks (DNNs) have been shown to perform very well on large-scale object recognition problems, which has led to their widespread use in real-world applications, including situations where DNNs are deployed as "black boxes". A promising approach to securing their use is to accept decisions that are likely to be correct while rejecting the others. In this work, we propose DOCTOR, a simple method that aims to identify whether the prediction of a DNN classifier should (or should not) be trusted, so that it can consequently be accepted or rejected. Two scenarios are investigated: Totally Black Box (TBB), where only the soft-predictions are available, and Partially Black Box (PBB), where gradient propagation can be used to perform input pre-processing. Empirically, we show that DOCTOR outperforms all state-of-the-art methods on various well-known image and sentiment analysis datasets. In particular, we observe a reduction of up to $4\%$ in the false rejection rate (FRR) in the PBB scenario. DOCTOR can be applied to any pre-trained model, requires no prior information about the underlying dataset, and is as simple as the simplest methods available in the literature.
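To make the accept/reject idea concrete, the sketch below illustrates a TBB-style rejection rule that relies only on the classifier's soft-predictions, using a Gini-impurity-style confidence score. The abstract does not specify DOCTOR's exact discriminator, so the score, the threshold `gamma`, and the decision rule here are illustrative assumptions rather than the paper's definitive procedure.

```python
import torch

def tbb_reject(softmax_probs: torch.Tensor, gamma: float) -> torch.Tensor:
    """Illustrative Totally Black Box (TBB) rejection sketch.

    softmax_probs: (batch, num_classes) soft-predictions of a pre-trained classifier.
    gamma: rejection threshold (hypothetical; would be tuned on held-out data).

    Returns a boolean mask: True where the prediction should be rejected.
    """
    # Gini-style confidence g(x) = sum_y P(y|x)^2: close to 1 when the
    # soft-prediction is peaked on one class, close to 1/num_classes when flat.
    g = (softmax_probs ** 2).sum(dim=-1)
    # Reject when the uncertainty-to-confidence ratio exceeds the threshold.
    return (1.0 - g) / g > gamma

# Usage example with dummy soft-predictions (assumed 3-class problem).
probs = torch.tensor([[0.95, 0.03, 0.02],   # confident -> likely accepted
                      [0.40, 0.35, 0.25]])  # uncertain -> likely rejected
print(tbb_reject(probs, gamma=0.5))
```

In the PBB scenario, the same kind of score could additionally be computed after a gradient-based input pre-processing step, since gradient propagation through the model is allowed there.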