Identifying unfamiliar inputs, also known as out-of-distribution (OOD) detection, is a crucial property of any decision-making process. A simple and empirically validated technique is based on deep ensembles, where the variance of predictions over different neural networks acts as a proxy for input uncertainty. Nevertheless, a theoretical understanding of the inductive biases behind the uncertainty estimates of deep ensembles is missing. To better describe their behavior, we study deep ensembles with large layer widths operating in simplified linear training regimes, in which the functions trained with gradient descent can be described by the neural tangent kernel. We identify two sources of noise, each inducing a distinct inductive bias in the predictive variance at initialization. We further show, theoretically and empirically, that both noise sources affect the predictive variance of non-linear deep ensembles after training, in toy models as well as realistic settings. Finally, we propose practical ways to eliminate some of these noise sources, leading to significant changes and improved OOD detection in trained deep ensembles.
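To make the ensemble-variance criterion concrete, the following is a minimal NumPy sketch, not taken from the paper: the two-layer ReLU architecture, the NTK-style 1/√(fan-in) scaling, and the specific test points are illustrative assumptions. It draws M independently initialized networks and scores an input by the variance of their predictions, which at initialization already grows with distance from the training-data region.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(d_in, width, d_out):
    # Illustrative NTK-style setup: entries drawn from N(0, 1);
    # the forward pass rescales each layer by 1/sqrt(fan_in).
    return {
        "W1": rng.standard_normal((width, d_in)),
        "b1": rng.standard_normal(width),
        "W2": rng.standard_normal((d_out, width)),
        "b2": rng.standard_normal(d_out),
    }

def forward(p, x):
    h = np.maximum(0.0, p["W1"] @ x / np.sqrt(x.shape[0]) + p["b1"])
    return p["W2"] @ h / np.sqrt(h.shape[0]) + p["b2"]

# Deep ensemble: M independently initialized networks.
M, d_in, width = 10, 2, 512
ensemble = [init_mlp(d_in, width, 1) for _ in range(M)]

def predictive_variance(x):
    # OOD score: variance of the M predictions at input x.
    preds = np.array([forward(p, x) for p in ensemble])
    return preds.var(axis=0).mean()

x_in = np.array([0.1, -0.2])   # hypothetical in-distribution point near the origin
x_ood = np.array([8.0, -9.0])  # hypothetical far-away (OOD) point

print(predictive_variance(x_in), predictive_variance(x_ood))
```

Under these assumptions the far-away point receives a larger variance, since for a ReLU network at initialization the predictive variance scales with the squared input norm; in practice the score would be computed with the trained ensemble members.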