Prior work (Klochkov \& Zhivotovskiy, 2021) establishes excess risk bounds of order $O\left(\log (n)/n\right)$ with high probability via algorithmic stability for strongly convex learners. We show that, under similar common assumptions on the losses (the Polyak-Lojasiewicz condition, smoothness, and Lipschitz continuity), rates of order $O\left(\log^2(n)/n^2\right)$ are achievable. To our knowledge, our analysis also provides the tightest high-probability bounds on gradient-based generalization gaps in nonconvex settings.
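For concreteness, the Polyak-Lojasiewicz condition and the excess risk referenced above are the standard notions; in the sketch below the symbols $F$, $w_S$, and $\mu$ are our own notation (population risk, learner output on a sample of size $n$, and PL constant), not fixed by the cited work:

$$
\frac{1}{2}\,\|\nabla F(w)\|^2 \;\ge\; \mu\bigl(F(w) - \inf_{w'} F(w')\bigr)
\quad\text{for all } w,
\qquad
\mathcal{E}(w_S) \;=\; F(w_S) - \inf_{w} F(w),
$$

and the claimed high-probability rate is $\mathcal{E}(w_S) = O\left(\log^2(n)/n^2\right)$.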