Bayesian neural network inference is often carried out using stochastic gradient sampling methods. For best performance, these methods should use a Riemannian metric that improves posterior exploration by accounting for local curvature, but existing methods resort to simple diagonal metrics to remain computationally efficient, forfeiting some of these gains. We propose two non-diagonal metrics that can be used in stochastic-gradient samplers to improve convergence and exploration, yet incur only a minor computational overhead over diagonal metrics. We show that for neural networks with complex posteriors, caused for example by sparsity-inducing priors, these metrics provide clear improvements. For some other choices the posterior is simple enough that the simpler metrics suffice.
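To make the baseline concrete, the sketch below shows one step of a diagonal-preconditioned stochastic-gradient Langevin sampler, the kind of simple diagonal metric the abstract contrasts against. It is an illustrative assumption using an RMSprop-style preconditioner, not the non-diagonal metrics proposed in the paper; all names and hyperparameters are hypothetical, and the metric-derivative correction term is omitted for brevity.

```python
import numpy as np

def preconditioned_sgld_step(theta, grad_log_post, v, lr=1e-3, beta=0.99, eps=1e-5, rng=None):
    """One step of diagonal-preconditioned SGLD (illustrative baseline, not the paper's method).

    theta         : current parameter vector
    grad_log_post : stochastic gradient of the log posterior at theta
    v             : running second-moment estimate used to build the diagonal metric
    """
    rng = np.random.default_rng() if rng is None else rng
    # Update the running curvature proxy and form the diagonal metric inverse G^{-1}.
    v = beta * v + (1.0 - beta) * grad_log_post ** 2
    g_inv = 1.0 / (np.sqrt(v) + eps)
    # Langevin update: preconditioned drift plus noise with matching covariance 2*lr*G^{-1}.
    noise = rng.normal(size=theta.shape) * np.sqrt(2.0 * lr * g_inv)
    theta = theta + lr * g_inv * grad_log_post + noise
    return theta, v
```

A non-diagonal metric would replace `g_inv` with a structured (e.g. low-rank plus diagonal) matrix, which is where the trade-off between posterior exploration and per-step cost discussed above arises.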