Provable Convergence of Variational Monte Carlo Methods

Variational Monte Carlo (VMC) is a promising approach for computing the ground state energy of many-body quantum problems and has attracted growing interest with the development of machine learning. Recent VMC paradigms construct neural networks as trial wave functions, sample quantum configurations using Markov chain Monte Carlo (MCMC), and train the networks with stochastic gradient descent (SGD). However, even for a well-designed trial wave function, the theoretical convergence of VMC remains unknown when SGD interacts with MCMC sampling. Although MCMC reduces the difficulty of estimating gradients, it introduces unavoidable bias in practice. Moreover, the local energy may be unbounded, which makes the error of MCMC sampling harder to analyze. To handle this, we assume that the local energy is sub-exponential and use a Bernstein inequality for non-stationary Markov chains to derive error bounds for the MCMC estimator. Consequently, VMC is proven to have a first-order convergence rate of $O(\log K/\sqrt{n K})$ with $K$ iterations and a sample size of $n$. This partially explains how MCMC influences the behavior of SGD. Furthermore, we verify the so-called correlated negative curvature condition and relate it to the zero-variance phenomena in solving eigenvalue functions. It is shown that VMC escapes from saddle points and reaches $(\epsilon,\epsilon^{1/4})$-approximate second-order stationary points or $\epsilon^{1/2}$-variance points in at least $O(\epsilon^{-11/2}\log^{2}(1/\epsilon))$ steps with high probability. Our analysis enriches the understanding of how VMC converges efficiently and can be applied to general variational methods in physics and statistics.
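To make the interaction between MCMC sampling and SGD concrete, the following is a minimal sketch of a VMC loop on a toy problem, not the setting analyzed in the paper. It assumes a one-dimensional harmonic oscillator $H = -\tfrac{1}{2}\,d^2/dx^2 + \tfrac{1}{2}x^2$ and a one-parameter Gaussian trial wave function $\psi_\alpha(x) = e^{-\alpha x^2}$ in place of a neural network. The function names (`local_energy`, `metropolis_samples`, `vmc_sgd`) and all hyperparameters are hypothetical choices made for illustration. The gradient is estimated with the standard covariance form $2\,\mathrm{Cov}(E_L, \partial_\alpha \log\psi_\alpha)$ for real-valued wave functions, and the Markov chain is carried over between iterations, so its samples are correlated and only approximately distributed as $|\psi_\alpha|^2$.

```python
import math
import random


def local_energy(x, alpha):
    # E_L(x) = -(1/2) psi''(x)/psi(x) + (1/2) x^2 for psi(x) = exp(-alpha x^2)
    return alpha + x * x * (0.5 - 2.0 * alpha * alpha)


def log_psi_grad(x):
    # d/d(alpha) log psi_alpha(x) = -x^2
    return -x * x


def metropolis_samples(alpha, n, x0, rng, step=1.0, burn_in=50):
    """Draw n correlated samples targeting |psi_alpha|^2 (random-walk Metropolis)."""
    x, samples = x0, []
    for t in range(burn_in + n):
        proposal = x + rng.uniform(-step, step)
        # log of |psi(proposal)|^2 / |psi(x)|^2
        log_ratio = -2.0 * alpha * (proposal * proposal - x * x)
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            x = proposal
        if t >= burn_in:
            samples.append(x)
    return samples, x


def vmc_sgd(alpha0=1.2, iterations=300, n=200, lr=0.1, seed=0):
    """Toy VMC: SGD on alpha with gradients estimated from MCMC samples."""
    rng = random.Random(seed)
    alpha, x = alpha0, 0.0
    e_mean = float("nan")
    for _ in range(iterations):
        # The chain restarts from its last state, so it is never exactly stationary.
        xs, x = metropolis_samples(alpha, n, x, rng)
        e = [local_energy(v, alpha) for v in xs]
        o = [log_psi_grad(v) for v in xs]
        e_mean = sum(e) / n
        o_mean = sum(o) / n
        # Gradient estimate: 2 * Cov(E_L, d log psi / d alpha)
        cov = sum((ei - e_mean) * (oi - o_mean) for ei, oi in zip(e, o)) / n
        alpha -= lr * 2.0 * cov
    return alpha, e_mean


if __name__ == "__main__":
    alpha, energy = vmc_sgd()
    print(f"alpha = {alpha:.4f}, energy estimate = {energy:.4f}")
```

For this toy Hamiltonian the exact ground state corresponds to $\alpha = 1/2$ with energy $1/2$, and at that point the local energy is constant in $x$. The sampled gradient noise therefore vanishes as the iterate approaches the eigenstate, which is a simple instance of the zero-variance phenomenon mentioned above.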
