We propose a new framework of variance-reduced Hamiltonian Monte Carlo (HMC) methods for sampling from an $L$-smooth and $m$-strongly log-concave distribution, based on a unified formulation of biased and unbiased variance reduction methods. We study the convergence properties for HMC with gradient estimators which satisfy the Mean-Squared-Error-Bias (MSEB) property. We show that the unbiased gradient estimators, including SAGA and SVRG, based HMC methods achieve highest gradient efficiency with small batch size under high precision regime, and require $\tilde{O}(N + \kappa^2 d^{\frac{1}{2}} \varepsilon^{-1} + N^{\frac{2}{3}} \kappa^{\frac{4}{3}} d^{\frac{1}{3}} \varepsilon^{-\frac{2}{3}} )$ gradient complexity to achieve $\epsilon$-accuracy in 2-Wasserstein distance. Moreover, our HMC methods with biased gradient estimators, such as SARAH and SARGE, require $\tilde{O}(N+\sqrt{N} \kappa^2 d^{\frac{1}{2}} \varepsilon^{-1})$ gradient complexity, which has the same dependency on condition number $\kappa$ and dimension $d$ as full gradient method, but improves the dependency of sample size $N$ for a factor of $N^\frac{1}{2}$. Experimental results on both synthetic and real-world benchmark data show that our new framework significantly outperforms the full gradient and stochastic gradient HMC approaches. The earliest version of this paper was submitted to ICML 2020 with three weak accept but was not finally accepted.
翻译:我们提出一个新的框架,用于根据有偏向和无偏差的减少差异方法的统一配制,从产值低廉的XLmooth 和 $mgy的日志分布采样。我们用梯度估测器研究HMC的趋同特性,这些估测器满足了平均值-夸德-差差-比亚(MSEB)的属性。我们表明,基于HMC方法的公平梯度估测器,包括SAGA和SVRG, 在高精度制度下,小批量的NGGA和SVRG, 达到最高梯度效率, 需要$+ kapapa2 d ⁇ 2 d ⁇ 1\ ⁇ ⁇ ⁇ ⁇ 1 ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ {xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx