We extend the convergence analysis of AdaSLS and AdaSPS in [Jiang and Stich, 2024] to the nonconvex setting, giving a unified analysis of stochastic gradient descent with an adaptive Armijo line-search (AdaSLS) and an adaptive Polyak stepsize (AdaSPS) for nonconvex optimization. Our contributions include: (1) an $\mathcal{O}(1/\sqrt{T})$ convergence rate for general nonconvex smooth functions, (2) an $\mathcal{O}(1/T)$ rate under quasar-convexity and interpolation, and (3) an $\mathcal{O}(1/T)$ rate under the strong growth condition for general nonconvex functions.
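To make the stepsize concrete, the sketch below runs plain SGD with an AdaSPS-style stepsize: a mini-batch Polyak ratio damped by the square root of the accumulated suboptimality gaps and capped by the previous stepsize. It is only an illustration, not the method of [Jiang and Stich, 2024]: the function name `adasps_sgd`, the choice of the lower bound $\ell^*_{S_t}=0$ (valid, e.g., for non-negative losses under interpolation), and the constant `c_p` are assumptions made here for simplicity; the exact stepsize definition and the AdaSLS line-search variant are given in the original paper.

```python
import numpy as np

def adasps_sgd(loss_fn, grad_fn, sample_batch, x0, T=1000, c_p=1.0, eta_max=1.0):
    """SGD with an AdaSPS-style adaptive Polyak stepsize (illustrative sketch).

    loss_fn(x, batch), grad_fn(x, batch): mini-batch loss and gradient.
    sample_batch(): draws a mini-batch S_t.
    The optimal mini-batch loss is assumed to be lower-bounded by 0, which
    plays the role of l*_{S_t} in the stepsize.
    """
    x = np.asarray(x0, dtype=float)
    gap_sum = 0.0        # accumulates f_{S_s}(x_s) - l*_{S_s}; gives the 1/sqrt damping
    eta_prev = eta_max   # the stepsize sequence is kept non-increasing
    for _ in range(T):
        batch = sample_batch()
        g = grad_fn(x, batch)
        gap = max(loss_fn(x, batch), 0.0)   # f_{S_t}(x_t) - l*_{S_t} with l* = 0
        gap_sum += gap
        g_sq = float(np.dot(g, g))
        if gap_sum > 0.0 and g_sq > 0.0:
            # Polyak ratio, damped by accumulated gaps, capped by the previous stepsize
            eta = min(gap / (c_p * g_sq * np.sqrt(gap_sum)), eta_prev)
        else:
            eta = eta_prev   # zero gradient or zero gap: the update below is a no-op
        x = x - eta * g
        eta_prev = eta
    return x

# Toy usage: interpolated least squares, where each per-sample optimum is 0.
rng = np.random.default_rng(0)
A, x_star = rng.normal(size=(50, 5)), rng.normal(size=5)
b = A @ x_star
loss = lambda x, i: 0.5 * float((A[i] @ x - b[i]) ** 2)
grad = lambda x, i: (A[i] @ x - b[i]) * A[i]
x_hat = adasps_sgd(loss, grad, lambda: rng.integers(len(b)), np.zeros(5), T=5000)
```

The two design choices visible in the sketch, a non-increasing stepsize and the $1/\sqrt{\cdot}$ damping by accumulated gaps, are what allow adaptive rates of the kind stated above without prior knowledge of the smoothness constant.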