Bayesian inference makes it possible to obtain useful information on the parameters of models, whether in computational statistics or, more recently, in the context of Bayesian neural networks. The computational cost of usual Monte Carlo methods for sampling posterior distributions in Bayesian inference scales linearly with the number of data points. One option for reducing this cost to a fraction is to resort to mini-batching in conjunction with unadjusted discretizations of Langevin dynamics, in which case only a random fraction of the data is used to estimate the gradient. However, this introduces additional noise into the dynamics and hence a bias in the invariant measure sampled by the Markov chain. We advocate using the so-called Adaptive Langevin (AdL) dynamics, a modification of standard inertial Langevin dynamics with a dynamical friction that automatically corrects for the increased noise arising from mini-batching. We investigate the practical relevance of the assumptions underpinning AdL (constant covariance of the gradient estimator, Gaussian mini-batching noise), which are not satisfied in typical models of Bayesian inference, and quantify the bias induced by mini-batching in this case. We also suggest a possible extension of AdL that further reduces the bias on the posterior distribution, by considering a dynamical friction depending on the current value of the parameter to sample.
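To make the mechanism concrete, below is a minimal sketch of an Adaptive Langevin integrator in the style of the stochastic gradient Nosé–Hoover thermostat, applied to a one-dimensional standard Gaussian target whose gradient is deliberately corrupted by Gaussian noise, mimicking mini-batching with constant covariance. This is an illustration under those stated assumptions, not the paper's exact scheme; step size, friction, and the `noisy_grad` oracle are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Target: standard Gaussian posterior, U(q) = q^2 / 2, so grad U(q) = q.
# The gradient is corrupted by Gaussian noise of fixed magnitude to mimic
# mini-batching noise with constant covariance (an assumption of AdL).
def noisy_grad(q, noise_std=1.0):
    return q + noise_std * rng.normal()

h = 0.01   # step size
A = 1.0    # a priori friction / amplitude of the injected noise
mu = 1.0   # thermostat "mass" controlling how fast the friction adapts

q, p, xi = 0.0, 0.0, A
n_steps, burn_in = 200_000, 20_000
samples = []

for step in range(n_steps):
    # Momentum update with noisy force, dynamical friction xi, and
    # injected noise of known amplitude sqrt(2 A h).
    p += -h * noisy_grad(q) - h * xi * p + np.sqrt(2 * A * h) * rng.normal()
    # Position update.
    q += h * p
    # Nose-Hoover-type friction update: xi grows when the kinetic energy
    # exceeds its target <p^2> = 1, absorbing the unknown gradient noise.
    xi += (h / mu) * (p * p - 1.0)
    if step >= burn_in:
        samples.append(q)

samples = np.asarray(samples)
print(f"mean ~ {samples.mean():.3f}, var ~ {samples.var():.3f}")
```

Despite the gradient noise being of the same order as the true force, the adapted friction settles at a value that compensates for it, so the empirical mean and variance of `q` stay close to those of the target N(0, 1); with a fixed friction of the same nominal value, the variance would be inflated.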