We propose a novel approach to analyzing the generalization error of discretizations of Langevin diffusion, such as stochastic gradient Langevin dynamics (SGLD). For an $\epsilon$ tolerance on the expected generalization error, it is known that a first order discretization can reach this target in $\Omega(\epsilon^{-1} \log (\epsilon^{-1}))$ iterations with $\Omega(\epsilon^{-1})$ samples. In this article, we show that under additional smoothness assumptions, even first order methods can achieve arbitrarily fast runtime complexity. More precisely, for each $N>0$, we provide a sufficient smoothness condition on the loss function under which a first order discretization reaches $\epsilon$ expected generalization error in $\Omega(\epsilon^{-1/N} \log (\epsilon^{-1}))$ iterations with $\Omega(\epsilon^{-1})$ samples.
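For concreteness, the first order discretization referenced above is the standard SGLD update: a stochastic gradient step on a minibatch plus isotropic Gaussian noise scaled by the step size and inverse temperature. The sketch below is a minimal illustration, not the paper's analysis; the function names, the quadratic toy loss, and all parameter values are assumptions chosen for the example.

```python
import numpy as np

def sgld_step(theta, grad_fn, data, batch_size, step_size, inv_temp, rng):
    """One SGLD iteration: minibatch gradient step plus injected Gaussian noise.

    Hypothetical helper for illustration; grad_fn(theta, batch) should return
    an unbiased estimate of the loss gradient at theta on the given minibatch.
    """
    idx = rng.choice(len(data), size=batch_size, replace=False)
    g = grad_fn(theta, data[idx])          # stochastic gradient estimate
    noise = rng.normal(size=theta.shape)   # standard Gaussian perturbation
    # Noise scale sqrt(2 * eta / beta) matches the Langevin diffusion it discretizes.
    return theta - step_size * g + np.sqrt(2.0 * step_size / inv_temp) * noise
```

On a simple quadratic loss (gradient `theta - batch.mean(axis=0)`), iterating this update drives the parameters toward the data mean, with residual fluctuation controlled by the inverse temperature.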