Deep Neural Networks (DNNs) are increasingly used as components of larger software systems that need to process complex data, such as images, written texts, and audio/video signals. DNN predictions cannot be assumed to always be correct, for several reasons: the huge input space being dealt with, the ambiguity of some input data, and the intrinsic properties of learning algorithms, which can provide only statistical guarantees. Hence, developers have to cope with a residual error probability. An architectural pattern commonly adopted to manage failure-prone components is the supervisor, an additional component that estimates the reliability of the predictions made by untrusted (e.g., DNN) components and activates an automated healing procedure when these are likely to fail, ensuring that the Deep Learning based System (DLS) causes no damage, despite its main functionality being suspended. In this paper, we consider DLSs that implement a supervisor by means of uncertainty estimation. After overviewing the main approaches to uncertainty estimation and discussing their pros and cons, we motivate the need for a dedicated empirical assessment method that can deal with the experimental setting in which supervisors are used, where the accuracy of the DNN matters only as long as the supervisor lets the DLS continue to operate. We then present a large empirical study conducted to compare the alternative approaches to uncertainty estimation. Finally, we distill a set of guidelines that help developers incorporate a supervisor based on uncertainty monitoring into a DLS.
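To make the supervisor pattern concrete, the following is a minimal sketch, not the paper's implementation: it wraps an untrusted classifier, uses softmax predictive entropy as one possible uncertainty estimator (the paper compares several), and triggers a fallback when uncertainty exceeds a threshold. The names `Supervisor`, `fallback`, and `threshold` are illustrative assumptions.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def predictive_entropy(probs):
    """Entropy of the predictive distribution; higher means more uncertain."""
    return -np.sum(probs * np.log(probs + 1e-12), axis=-1)

class Supervisor:
    """Wraps an untrusted DNN: accepts its prediction only when the
    estimated uncertainty stays below a threshold; otherwise it invokes
    a healing procedure (here, a placeholder safe fallback)."""

    def __init__(self, model, threshold, fallback):
        self.model = model          # callable: input -> logits
        self.threshold = threshold  # typically tuned on a validation set
        self.fallback = fallback    # callable: input -> safe action

    def predict(self, x):
        probs = softmax(self.model(x))
        if predictive_entropy(probs) > self.threshold:
            return self.fallback(x)   # main DLS functionality suspended
        return int(np.argmax(probs))  # prediction deemed trustworthy
```

In this setting the DNN's accuracy matters only on the inputs the supervisor lets through, which is why the assessment method must score the supervisor and the model jointly rather than the model in isolation.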