During training, the weights of a Deep Neural Network (DNN) are optimized from a random initialization towards a near-optimal value that minimizes a loss function. Typically, only this final state of the weights is kept for testing, while the wealth of information on the geometry of the weight space, accumulated during the descent towards the minimum, is discarded. In this work we propose to make use of this knowledge by leveraging it to compute the distributions of the weights of the DNN. These distributions can then be used to estimate the epistemic uncertainty of the DNN by sampling an ensemble of networks from them. To this end we introduce a method for tracking the trajectory of the weights during optimization that requires no changes to the architecture or the training procedure. We evaluate our method on standard classification and regression benchmarks, and on out-of-distribution detection for classification and semantic segmentation. We achieve competitive results while preserving computational efficiency compared to other popular approaches.
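To illustrate the overall idea, the following is a minimal sketch, assuming weight snapshots are recorded periodically along the optimization trajectory, a diagonal Gaussian is fit per weight from those snapshots, and predictions are averaged over networks sampled from that distribution. The helper names (`collect_snapshots`, `fit_weight_distribution`, `predictive_ensemble`), the snapshot schedule, and the PyTorch usage are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of trajectory-based weight distributions for epistemic
# uncertainty; not the authors' implementation.
import torch
import torch.nn.utils as nn_utils


def collect_snapshots(model, optimizer, loss_fn, loader, max_steps, every=10, burn_in=100):
    """Record flattened weight vectors along the optimization trajectory."""
    snapshots, step = [], 0
    for x, y in loader:
        optimizer.zero_grad()
        loss_fn(model(x), y).backward()
        optimizer.step()
        if step >= burn_in and step % every == 0:
            # Snapshot the full weight vector without altering the training loop.
            snapshots.append(
                nn_utils.parameters_to_vector(model.parameters()).detach().clone()
            )
        step += 1
        if step >= max_steps:
            break
    return torch.stack(snapshots)  # shape: (num_snapshots, num_weights)


def fit_weight_distribution(snapshots):
    """Fit an independent Gaussian per weight from the recorded trajectory."""
    return snapshots.mean(dim=0), snapshots.std(dim=0)


@torch.no_grad()
def predictive_ensemble(model, mean, std, x, num_samples=10):
    """Sample weight vectors, load them into the model, and aggregate predictions."""
    preds = []
    for _ in range(num_samples):
        sample = torch.normal(mean, std)
        nn_utils.vector_to_parameters(sample, model.parameters())
        preds.append(torch.softmax(model(x), dim=-1))
    probs = torch.stack(preds)
    # Mean prediction and its variance across sampled networks
    # (the variance serves as an epistemic-uncertainty estimate).
    return probs.mean(dim=0), probs.var(dim=0)
```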