Recent years have seen advances in generalization bounds for noisy stochastic algorithms, especially stochastic gradient Langevin dynamics (SGLD), based on stability (Mou et al., 2018; Li et al., 2020) and information-theoretic approaches (Xu and Raginsky, 2017; Negrea et al., 2019; Steinke and Zakynthinou, 2020). In this paper, we unify and substantially generalize stability-based generalization bounds, making three technical contributions. First, we bound the generalization error in terms of expected (not uniform) stability, which arguably leads to quantitatively sharper bounds. Second, as our main contribution, we introduce Exponential Family Langevin Dynamics (EFLD), a substantial generalization of SGLD that includes noisy versions of Sign-SGD and quantized SGD as special cases. We establish data-dependent, expected-stability based generalization bounds for any EFLD algorithm with O(1/n) sample dependence and dependence on gradient discrepancy rather than the norm of gradients, yielding significantly sharper bounds. Third, we establish optimization guarantees for special cases of EFLD. Further, empirical results on benchmarks illustrate that our bounds are non-vacuous, quantitatively sharper than existing bounds, and behave correctly under noisy labels.
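To make the family of updates concrete, here is a minimal illustrative sketch of the two endpoints the abstract mentions: a standard SGLD step with isotropic Gaussian noise, and a Bernoulli-noise sign-based step of the kind EFLD is designed to cover. The function names, the temperature parameters, and the sigmoid parameterization of the sign probability are assumptions for illustration, not the paper's exact EFLD formulation.

```python
import numpy as np

def sgld_step(w, grad, lr, beta, rng):
    """One SGLD update: gradient step plus Gaussian noise.

    Uses the standard scaling sqrt(2 * lr / beta), where beta is
    an inverse-temperature parameter (illustrative formulation).
    """
    noise = rng.normal(size=w.shape) * np.sqrt(2.0 * lr / beta)
    return w - lr * grad + noise

def noisy_sign_sgd_step(w, grad, lr, temp, rng):
    """One noisy Sign-SGD update: each coordinate moves by +/- lr.

    The sign is sampled from a Bernoulli distribution biased by the
    gradient through a sigmoid, illustrating how a sign-based update
    can arise from an exponential-family (Bernoulli) noise channel.
    `temp` controls how deterministic the sign is (assumed parameter).
    """
    p_plus = 1.0 / (1.0 + np.exp(grad / temp))  # P(step = +1); large positive grad -> step -1
    steps = np.where(rng.random(w.shape) < p_plus, 1.0, -1.0)
    return w + lr * steps

# Example usage on a toy quadratic loss f(w) = 0.5 * ||w||^2:
rng = np.random.default_rng(0)
w = np.ones(5)
w = sgld_step(w, grad=w, lr=0.1, beta=100.0, rng=rng)
w = noisy_sign_sgd_step(w, grad=w, lr=0.1, temp=0.5, rng=rng)
```

As temp goes to zero the sign step becomes deterministic Sign-SGD, while larger temp injects more noise; this trade-off between signal and noise is the kind of structure the stability-based bounds in the paper are stated over.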