Uncertainty quantification is at the core of the reliability and robustness of machine learning. It is well known that uncertainty consists of two different types, often referred to as aleatoric and epistemic uncertainty. In this paper, we provide a systematic study of epistemic uncertainty in deep supervised learning. We rigorously distinguish different sources of epistemic uncertainty, including in particular procedural variability (from the training procedure) and data variability (from the training data). We use our framework to explain how deep ensembles enhance prediction by reducing procedural variability. We also propose two approaches to estimate the epistemic uncertainty of a well-trained neural network in practice. One uses an influence function derived from neural tangent kernel theory, which bypasses the convexity assumption violated by modern neural networks. The other uses batching, which bypasses the time-consuming Gram matrix inversion in the influence function calculation while expending minimal re-training effort. We discuss how both approaches overcome some of the difficulties in applying classical statistical methods to inference for deep learning.
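To make the two variance sources concrete, below is a minimal sketch, not the paper's implementation, of the two re-training schemes the abstract refers to: a deep ensemble that re-trains on the full data under different random seeds, so the spread across models reflects procedural variability, and a batching scheme (interpreted here, as an assumption, in the classical batch-means sense) that re-trains on disjoint subsets of the data with a fixed seed, so the spread reflects data variability with no Gram matrix inversion. All names (`make_net`, `procedural_variance`, `batching_variance`) and the toy regression task are hypothetical placeholders.

```python
# Hedged sketch: estimating the two components of epistemic uncertainty
# by re-training. Not the paper's method; an illustrative interpretation.
import torch
import torch.nn as nn

def make_net() -> nn.Module:
    # Small regression network; the architecture is an arbitrary placeholder.
    return nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, 1))

def train(net: nn.Module, x: torch.Tensor, y: torch.Tensor,
          epochs: int = 200) -> nn.Module:
    opt = torch.optim.Adam(net.parameters(), lr=1e-2)
    for _ in range(epochs):
        opt.zero_grad()
        loss = nn.functional.mse_loss(net(x), y)
        loss.backward()
        opt.step()
    return net

def procedural_variance(x, y, x_test, n_models: int = 5):
    # Deep ensemble: same data, different seeds. The variance across
    # members reflects procedural variability; the ensemble mean reduces it.
    preds = []
    for seed in range(n_models):
        torch.manual_seed(seed)
        net = train(make_net(), x, y)
        preds.append(net(x_test).detach())
    preds = torch.stack(preds)
    return preds.mean(0), preds.var(0)

def batching_variance(x, y, x_test, n_batches: int = 5):
    # Batching: disjoint data subsets, one model per subset, fixed seed
    # to isolate data variability. No Gram matrix is ever inverted.
    idx = torch.randperm(len(x))
    preds = []
    for b in range(n_batches):
        part = idx[b::n_batches]
        torch.manual_seed(0)  # fixed seed: hold the procedure constant
        net = train(make_net(), x[part], y[part])
        preds.append(net(x_test).detach())
    preds = torch.stack(preds)
    return preds.mean(0), preds.var(0)

if __name__ == "__main__":
    torch.manual_seed(0)
    x = torch.linspace(-1, 1, 200).unsqueeze(1)
    y = torch.sin(3 * x) + 0.1 * torch.randn_like(x)
    x_test = torch.linspace(-1, 1, 50).unsqueeze(1)
    _, proc_var = procedural_variance(x, y, x_test)
    _, data_var = batching_variance(x, y, x_test)
    print("mean procedural variance:", proc_var.mean().item())
    print("mean data-driven variance:", data_var.mean().item())
```

The seed is held fixed in the batching scheme precisely so that the spread across batch-trained models isolates the data-driven component; how the two components are combined into a single epistemic uncertainty estimate is left to the paper's framework.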