Multi-index Antithetic Stochastic Gradient Algorithm (MASGA)

From arXiv: 51 pages, 8 figures. Revised version: an improved introduction, a completely new numerical section including experiments in non-convex settings, and a new appendix discussing the dependence of the variance of SGLD on the mini-batch size.

Stochastic Gradient Algorithms (SGAs) are ubiquitous in computational statistics, machine learning and optimisation. Recent years have brought an influx of interest in SGAs, and the non-asymptotic analysis of their bias is by now well-developed. However, relatively little is known about the optimal choice of the random approximation (e.g. mini-batching) of the gradient in SGAs, as this relies on the analysis of the variance and is problem-specific. While there have been numerous attempts to reduce the variance of SGAs, these typically exploit a particular structure of the sampled distribution by requiring a priori knowledge of its density's mode. It is thus unclear how to adapt such algorithms to non-log-concave settings. In this paper, we construct a Multi-index Antithetic Stochastic Gradient Algorithm (MASGA) whose implementation is independent of the structure of the target measure and which achieves performance on par with Monte Carlo estimators that have access to unbiased samples from the distribution of interest. In other words, MASGA is an optimal estimator from the mean square error-computational cost perspective within the class of Monte Carlo estimators. We prove this fact rigorously for log-concave settings and verify it numerically for some examples where the log-concavity assumption is not satisfied.
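To make the phrase "random approximation (e.g. mini-batching) of the gradient" concrete, here is a minimal sketch of an unbiased mini-batch gradient estimator and of how its variance depends on the batch size. It is not MASGA itself. The toy potential U(theta) = 0.5 * sum_i (theta - x_i)^2, the synthetic data, and all function names are assumptions chosen purely for illustration.

```python
import random
import statistics

def full_gradient(theta, data):
    # Exact gradient of the toy potential U(theta) = 0.5 * sum_i (theta - x_i)^2.
    return sum(theta - x for x in data)

def minibatch_gradient(theta, data, batch_size, rng):
    # Unbiased estimate of the full gradient: sum over a mini-batch drawn
    # without replacement, rescaled by n / batch_size.
    batch = rng.sample(data, batch_size)
    return len(data) / batch_size * sum(theta - x for x in batch)

def estimator_stats(theta, data, batch_size, n_rep, rng):
    # Empirical mean and variance of the estimator over n_rep independent draws.
    draws = [minibatch_gradient(theta, data, batch_size, rng) for _ in range(n_rep)]
    return statistics.fmean(draws), statistics.variance(draws)

rng = random.Random(0)
data = [rng.gauss(1.0, 2.0) for _ in range(200)]  # hypothetical synthetic data set
theta = 0.0

exact = full_gradient(theta, data)
mean_small, var_small = estimator_stats(theta, data, 5, 4000, rng)
mean_large, var_large = estimator_stats(theta, data, 50, 4000, rng)

print(f"exact gradient:          {exact:.2f}")
print(f"batch size 5:  mean {mean_small:.2f}, variance {var_small:.1f}")
print(f"batch size 50: mean {mean_large:.2f}, variance {var_large:.1f}")
```

Both batch sizes give estimates centred on the exact gradient, but the larger batch has a much smaller variance at a higher cost per step. This cost-variance trade-off is the one the abstract describes as problem-specific.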
