In this paper, we study the computational complexity of sampling from a Bayesian posterior (or pseudo-posterior) using the Metropolis-adjusted Langevin algorithm (MALA). MALA first proposes a new state by taking one step of a time-discretized Langevin SDE, and then accepts or rejects the proposed state with a Metropolis-Hastings correction. Most existing theoretical analyses of MALA rely on smoothness and strong log-concavity of the target distribution, properties that unfortunately often fail to hold in practical Bayesian problems. Our analysis instead relies on statistical large-sample theory, which restricts the deviation of the Bayesian posterior from being smooth and log-concave in a very specific manner. Specifically, we establish the optimal $d^{1/3}$ dependence on the parameter dimension $d$ in the non-asymptotic mixing time upper bound for MALA after the burn-in period, without assuming smoothness or log-concavity of the target posterior density; here MALA is slightly modified by replacing the gradient with any subgradient wherever the log-density is non-differentiable. In comparison, the well-known scaling limit for the classical Metropolis random walk (MRW) suggests a linear dependence on $d$ in its mixing time. Thus, our results formally verify the conventional wisdom that, in the context of Bayesian computation, MALA, as a first-order method using gradient information, is more efficient than MRW, a zeroth-order method using only function-value information.
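The two-stage MALA update described above (Langevin proposal, then Metropolis-Hastings correction) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names `mala`, `log_density`, and `subgrad_log_density`, the toy target $\log \pi(x) = -\|x\|^2/2 - \lambda \|x\|_1$ (non-differentiable at zero coordinates), and the step size $h = 0.5\, d^{-1/3}$ are all hypothetical choices made for the example.

```python
import numpy as np


def mala(log_density, subgrad_log_density, x0, step_size, n_steps, rng):
    """Run MALA and return (samples, acceptance_rate).

    subgrad_log_density may return any subgradient where the
    log-density is not differentiable.
    """
    x = np.asarray(x0, dtype=float)
    logp_x = log_density(x)
    grad_x = subgrad_log_density(x)
    samples = np.empty((n_steps, x.size))
    n_accept = 0
    for t in range(n_steps):
        # Langevin proposal: y = x + h * grad + sqrt(2h) * N(0, I).
        noise = rng.standard_normal(x.size)
        y = x + step_size * grad_x + np.sqrt(2.0 * step_size) * noise
        logp_y = log_density(y)
        grad_y = subgrad_log_density(y)
        # Gaussian proposal log-densities, up to a shared constant.
        log_q_x_to_y = -np.sum((y - x - step_size * grad_x) ** 2) / (4.0 * step_size)
        log_q_y_to_x = -np.sum((x - y - step_size * grad_y) ** 2) / (4.0 * step_size)
        # Metropolis-Hastings log acceptance ratio.
        log_alpha = logp_y + log_q_y_to_x - logp_x - log_q_x_to_y
        if np.log(rng.uniform()) < log_alpha:
            x, logp_x, grad_x = y, logp_y, grad_y
            n_accept += 1
        samples[t] = x
    return samples, n_accept / n_steps


# Hypothetical toy target (an assumption for illustration only):
# log pi(x) = -||x||^2 / 2 - LAM * ||x||_1, non-differentiable at 0.
LAM = 1.0


def log_density(x):
    return -0.5 * np.sum(x ** 2) - LAM * np.sum(np.abs(x))


def subgrad_log_density(x):
    # np.sign(0) == 0, which is a valid subgradient choice for |x| at 0.
    return -x - LAM * np.sign(x)


d = 10
rng = np.random.default_rng(0)
h = 0.5 * d ** (-1.0 / 3.0)  # assumed step-size scaling, illustrative only
samples, acc_rate = mala(log_density, subgrad_log_density, np.zeros(d), h, 2000, rng)
```

A random-walk Metropolis sampler would differ only in dropping the `step_size * grad` drift term from the proposal, in which case the two proposal log-densities cancel in the acceptance ratio; the abstract's comparison concerns how the usable step size, and hence the mixing time, scales with $d$ in the two cases.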