The computational cost of standard Monte Carlo methods for sampling posterior distributions in Bayesian inference scales linearly with the number of data points. One way to reduce this cost to a fraction is to resort to mini-batching in conjunction with unadjusted discretizations of Langevin dynamics, where only a random subset of the data is used to estimate the gradient at each step. This, however, injects additional noise into the dynamics and hence biases the invariant measure sampled by the Markov chain. We advocate the so-called Adaptive Langevin dynamics, a modification of standard inertial Langevin dynamics with a dynamical friction that automatically corrects for the increased noise arising from mini-batching. We investigate the practical relevance of the assumption underpinning Adaptive Langevin (a constant covariance for the gradient estimator), which does not hold in typical models of Bayesian inference, and show how to extend the approach to more general situations.
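To make the mechanism concrete, the following is a minimal sketch of Adaptive Langevin (stochastic gradient Nosé–Hoover-type) sampling with mini-batched gradients, on a hypothetical conjugate Gaussian model chosen for illustration; all variable names, step sizes, and batch sizes below are assumptions, not the paper's actual experimental setup. Note that for this Gaussian mean model the mini-batch gradient noise has a covariance independent of the parameter, so the constant-covariance assumption discussed above happens to hold.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical model: data y_i ~ N(theta, 1), prior theta ~ N(0, sigma0^2).
N, sigma0_sq = 100, 10.0
y = rng.normal(1.0, 1.0, size=N)

# Exact Gaussian posterior, for reference only.
post_prec = 1.0 / sigma0_sq + N          # posterior precision
post_mean = y.sum() / post_prec          # posterior mean

def grad_U_minibatch(theta, batch_size=10):
    """Unbiased mini-batch estimate of grad U = -grad log posterior."""
    idx = rng.choice(N, size=batch_size, replace=False)
    return theta / sigma0_sq + (N / batch_size) * (theta - y[idx]).sum()

# Adaptive Langevin step (kT = 1, dimension d = 1):
#   p  <- p - h grad_U(q) - h xi p + sqrt(2 A h) N(0, 1)
#   q  <- q + h p
#   xi <- xi + h (p^2 - 1)   # dynamical friction adapts to the total noise,
#                            # including the unknown mini-batching noise
h, A = 0.01, 1.0
q, p, xi = 0.0, 0.0, A
samples = []
for step in range(60_000):
    p += -h * grad_U_minibatch(q) - h * xi * p + np.sqrt(2 * A * h) * rng.normal()
    q += h * p
    xi += h * (p * p - 1.0)
    if step >= 10_000:        # discard burn-in
        samples.append(q)

samples = np.array(samples)
```

The friction `xi` is not fixed in advance: it evolves with the kinetic energy and settles at whatever value balances the combined injected and mini-batching noise, which is what removes the bias a plain unadjusted Langevin scheme would incur.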