Bayesian inference allows one to obtain useful information on the parameters of models, both in computational statistics and, more recently, in the context of Bayesian neural networks. The computational cost of standard Monte Carlo methods for sampling posterior distributions in Bayesian inference scales linearly with the number of data points. One option to reduce this cost to a fraction is to resort to mini-batching in conjunction with unadjusted discretizations of Langevin dynamics, in which case only a random fraction of the data is used to estimate the gradient at each iteration. However, this introduces additional noise in the dynamics and hence a bias on the invariant measure sampled by the Markov chain. We advocate using the so-called Adaptive Langevin dynamics (AdL), a modification of standard inertial Langevin dynamics with a dynamical friction that automatically corrects for the increased noise arising from mini-batching. We investigate the practical relevance of the assumptions underpinning AdL (constant covariance of the gradient estimator), which are not satisfied in typical models of Bayesian inference, and quantify the bias induced by mini-batching in this case. We also show how to extend AdL in order to systematically reduce the bias on the posterior distribution by considering a dynamical friction depending on the current value of the parameter to sample.
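To make the setting concrete, the following is a minimal sketch (not the authors' implementation) of Adaptive Langevin dynamics with mini-batched gradients, using a simple Euler discretization on a one-dimensional Gaussian posterior. All parameter values (`dt`, `sigma`, `mu`, `kT`, the batch size) are illustrative assumptions; the key point is that the friction `xi` is itself a dynamical variable, driven by the deviation of the kinetic energy from its target, so it adapts to the unknown extra noise introduced by the stochastic gradient.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 1D Gaussian likelihood with unit variance, flat prior,
# so the posterior on the mean theta concentrates at the sample mean.
N = 1000
data = rng.normal(1.0, 1.0, size=N)

def minibatch_grad(theta, batch_size=32):
    """Unbiased mini-batch estimate of the gradient of the negative
    log-posterior, rescaled to the full data set."""
    batch = rng.choice(data, size=batch_size, replace=False)
    return N * np.mean(theta - batch)

# Illustrative parameters (assumed, not tuned): step size, noise amplitude,
# thermostat "mass" mu, and target temperature kT.
dt, sigma, mu, kT = 1e-4, 1.0, 1.0, 1.0

theta = np.mean(data)  # start near the posterior mode
p, xi = 0.0, 1.0       # momentum and dynamical friction
samples = []
for step in range(20000):
    p += -minibatch_grad(theta) * dt - xi * p * dt \
         + sigma * np.sqrt(dt) * rng.standard_normal()
    theta += p * dt
    # Nose-Hoover-like update: xi grows when the kinetic energy exceeds
    # its target kT, absorbing the unquantified mini-batch noise.
    xi += (p * p - kT) * dt / mu
    samples.append(theta)

print(np.mean(samples[10000:]))  # close to the sample mean of the data
```

In contrast with standard inertial Langevin dynamics, no fluctuation-dissipation balance between `sigma` and a fixed friction is imposed here; the self-adjusting `xi` supplies the missing dissipation, which is precisely the mechanism the abstract describes.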