Variational inference in Bayesian neural networks is usually performed using stochastic sampling, which gives very high-variance gradient estimates and hence slow learning. Here, we show that it is possible to obtain a deterministic approximation of the ELBO for a Bayesian neural network by performing a Taylor-series expansion around the mean of the current variational distribution. The resulting approximate ELBO is the training log-likelihood plus a squared-gradient regulariser. In addition to learning the approximate posterior variance, we also consider a uniform-variance approximate posterior, inspired by the stationary distribution of SGD. The corresponding approximate ELBO takes a particularly simple form: the log-likelihood plus a squared-gradient regulariser. We argue that this squared-gradient regularisation may be at the root of the excellent empirical performance of SGD.
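To make the shape of this objective concrete, the sketch below shows a regularised training loss of the kind described above. It is a minimal illustration, not the paper's code, and rests on assumptions introduced here: a toy regression network, a factorised Gaussian approximate posterior with a single shared variance `sigma2`, and the expected log-likelihood approximated by its value at the posterior mean plus a variance-weighted squared-gradient penalty; the KL term of the ELBO is omitted.

```python
# Minimal sketch (assumptions as stated in the text above): training
# log-likelihood at the posterior mean plus a squared-gradient regulariser
# standing in for the variance-weighted curvature term of the approximate ELBO.
import jax
import jax.numpy as jnp

def nll(params, x, y):
    # Hypothetical negative log-likelihood of a small regression network.
    w1, b1, w2, b2 = params
    h = jnp.tanh(x @ w1 + b1)
    pred = h @ w2 + b2
    return 0.5 * jnp.sum((pred - y) ** 2)

def regularised_loss(params, x, y, sigma2=1e-3):
    # Log-likelihood term evaluated at the posterior mean ...
    loss = nll(params, x, y)
    # ... plus a squared-gradient penalty scaled by the (uniform) posterior
    # variance sigma2, playing the role of the deterministic ELBO correction.
    grads = jax.grad(nll)(params, x, y)
    sq_grad = sum(jnp.sum(g ** 2) for g in jax.tree_util.tree_leaves(grads))
    return loss + 0.5 * sigma2 * sq_grad

# Usage: the regularised loss is itself differentiable, so it can be
# minimised with ordinary SGD/Adam via jax.grad(regularised_loss).
key = jax.random.PRNGKey(0)
k1, k2, k3 = jax.random.split(key, 3)
params = (jax.random.normal(k1, (2, 8)), jnp.zeros(8),
          jax.random.normal(k2, (8, 1)), jnp.zeros(1))
x = jax.random.normal(k3, (16, 2))
y = jnp.sum(x, axis=1, keepdims=True)
print(regularised_loss(params, x, y))
```

Since the penalty is a function of gradients, optimising it requires differentiating through a gradient (double backpropagation), which automatic-differentiation frameworks such as JAX handle directly.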