Stochastic gradient MCMC (SG-MCMC) methods for sampling statistical distributions replace exact gradients with stochastic approximations, commonly computed from uniformly subsampled data points. We propose a non-uniform subsampling scheme to improve the sampling accuracy. The proposed exponentially weighted stochastic gradient (EWSG) is designed so that a non-uniform SG-MCMC method mimics the statistical behavior of a batch-gradient MCMC method, thereby reducing the inaccuracy due to the SG approximation. EWSG differs from variance reduction (VR) techniques in that it targets the entire distribution rather than just the variance; nevertheless, its local variance is also proved to be reduced. EWSG can also be viewed as an extension of the importance sampling idea, which has been successful in SG-based optimization, to sampling tasks. In our practical implementation of EWSG, the non-uniform subsampling is performed efficiently via a Metropolis-Hastings chain on the data index, which is coupled to the MCMC algorithm. Numerical experiments are provided, not only to demonstrate EWSG's effectiveness, but also to guide hyperparameter choices and to validate our \emph{non-asymptotic global error bound} despite the approximations in the implementation. Notably, while statistical accuracy is improved, convergence speed can be comparable to that of the uniform version, which makes EWSG a practical alternative to VR (although EWSG and VR can also be combined).
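To illustrate the index-sampling idea in isolation, the following is a minimal sketch of a Metropolis-Hastings chain over data indices. It is not the EWSG algorithm itself: the abstract does not specify the EWSG weights, so the fixed `weights` list, the function name `mh_index_chain`, and the uniform proposal are all assumptions made for illustration. In a coupled scheme the weights would presumably depend on the current state of the sampler. The sketch only shows that such a chain can draw indices non-uniformly without ever computing the normalizing constant of the weights.

```python
import random


def mh_index_chain(weights, n_steps, seed=0):
    """Run a Metropolis-Hastings chain over indices 0..n-1.

    `weights` are hypothetical positive, unnormalized weights (placeholders,
    not the EWSG weights). Proposals are uniform over all indices, hence
    symmetric, so the acceptance probability is min(1, w[j] / w[i]).
    The chain's stationary distribution is proportional to `weights`.
    Returns the empirical visit frequency of each index.
    """
    rng = random.Random(seed)
    n = len(weights)
    i = rng.randrange(n)          # current data index
    visits = [0] * n
    for _ in range(n_steps):
        j = rng.randrange(n)      # uniform proposal
        if rng.random() < min(1.0, weights[j] / weights[i]):
            i = j                 # accept the proposed index
        visits[i] += 1            # on rejection, the old index is counted again
    return [v / n_steps for v in visits]


weights = [1.0, 2.0, 3.0, 4.0]
freqs = mh_index_chain(weights, n_steps=200_000)
```

Each step needs only the ratio of two weights, so the per-step cost does not grow with the number of data points. With the placeholder weights above, the visit frequencies should approach the normalized weights as the chain length increases.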