We establish a sharp uniform-in-time error estimate for Stochastic Gradient Langevin Dynamics (SGLD), a popular sampling algorithm. Under mild assumptions, we obtain a uniform-in-time $O(\eta^2)$ bound on the KL-divergence between the SGLD iterates and the Langevin diffusion, where $\eta$ is the step size (or learning rate). Our analysis also remains valid for varying step sizes. This in turn yields an $O(\eta)$ bound on the distance between the SGLD iterates and the invariant distribution of the Langevin diffusion, in Wasserstein or total variation distance.
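To make the object of study concrete, the following is a minimal sketch of the SGLD iteration, $\theta_{k+1} = \theta_k - \eta\,\widehat{\nabla U}(\theta_k) + \sqrt{2\eta}\,\xi_k$, on an illustrative toy problem (a Gaussian posterior for a mean parameter); the target, minibatch size, and step size here are assumptions chosen for illustration, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative target: posterior of a Gaussian mean with a N(0, 1) prior
# and N(theta, 1) likelihood, so U(theta) = theta^2/2 + sum_i (theta - x_i)^2/2.
data = rng.normal(loc=2.0, scale=1.0, size=1000)
n = len(data)

def stoch_grad(theta, batch_size=32):
    """Unbiased minibatch estimate of grad U(theta)."""
    batch = rng.choice(data, size=batch_size, replace=False)
    # prior gradient + rescaled minibatch likelihood gradient
    return theta + n * np.mean(theta - batch)

eta = 1e-4      # step size (the eta in the error bounds)
theta = 0.0
samples = []
for k in range(20000):
    # SGLD update: gradient step plus injected Gaussian noise
    theta = theta - eta * stoch_grad(theta) + np.sqrt(2 * eta) * rng.normal()
    if k >= 10000:  # discard burn-in
        samples.append(theta)

# The exact posterior mean is sum(data) / (n + 1), close to 2 here;
# the SGLD estimate carries an O(eta) bias relative to it.
print(np.mean(samples))
```

The uniform-in-time aspect of the result is what justifies running such an iteration for arbitrarily many steps: the discretization error does not accumulate, and the long-run samples stay within $O(\eta)$ of the target.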