Uncertainty quantification for deep neural networks has recently evolved through a variety of techniques. In this work, we revisit the Laplace approximation, a classical and computationally attractive approach to posterior approximation. Instead of computing the curvature matrix, we show that, under some regularity conditions, the Laplace approximation can be constructed directly from the gradient second moment. This quantity is already estimated by exponential-moving-average variants of Adagrad, such as Adam and RMSprop, but is traditionally discarded after training. We show that our method (L2M) requires no changes to models or optimization, can be implemented in a few lines of code to yield reasonable results, needs no computation beyond what optimizers already perform, and introduces no new hyperparameters. We hope our method opens new research directions on using quantities already computed by optimizers for uncertainty estimation in deep neural networks.
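To make the idea concrete, below is a minimal PyTorch sketch of the general recipe described above: after training with Adam, read out the exponential-moving-average second-moment estimates (exp_avg_sq) from the optimizer state and use them as a diagonal curvature proxy for a Gaussian (Laplace) posterior over the parameters. The scaling by the dataset size and the prior_precision term are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def laplace_posterior_from_adam(optimizer, n_data, prior_precision=1.0):
    """Build a diagonal Gaussian posterior over parameters from Adam's
    second-moment estimates, which are otherwise discarded after training.
    The precision formula below is an illustrative assumption."""
    means, variances = [], []
    for group in optimizer.param_groups:
        for p in group["params"]:
            state = optimizer.state.get(p, {})
            if "exp_avg_sq" not in state:
                continue
            # exp_avg_sq is Adam's EMA estimate of the per-parameter
            # gradient second moment E[g^2]; use it as a curvature proxy.
            precision = n_data * state["exp_avg_sq"] + prior_precision
            means.append(p.detach().clone())
            variances.append(1.0 / precision)
    return means, variances

def sample_parameters(means, variances):
    """Draw one sample of the parameters from the diagonal Gaussian posterior."""
    return [m + v.sqrt() * torch.randn_like(m) for m, v in zip(means, variances)]
```

In use, one would load each sampled tensor back into the corresponding model parameter and average the resulting predictions over several samples to obtain predictive uncertainty; no extra passes over the data are needed beyond the original training run.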