We derive bounds on the path length $\zeta$ of gradient descent (GD) and gradient flow (GF) curves for various classes of smooth convex and nonconvex functions. Among other results, we prove that: (a) if the iterates are linearly convergent with factor $(1-c)$, then $\zeta$ is at most $\mathcal{O}(1/c)$; (b) under the Polyak-Kurdyka-Łojasiewicz (PKL) condition, $\zeta$ is at most $\mathcal{O}(\sqrt{\kappa})$, where $\kappa$ is the condition number, and at least $\widetilde\Omega(\sqrt{d} \wedge \kappa^{1/4})$; (c) for quadratics, $\zeta$ is $\Theta(\min\{\sqrt{d},\sqrt{\log \kappa}\})$ and in some cases can be independent of $\kappa$; (d) assuming just convexity, $\zeta$ is at most $2^{4d\log d}$; (e) for separable quasiconvex functions, $\zeta$ is $\Theta(\sqrt{d})$. Thus, we advance the current understanding of the properties of GD and GF curves beyond rates of convergence. We expect our techniques to facilitate future studies of other algorithms.
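To make the quantity in (a) and (c) concrete, the following is a minimal numerical sketch, not the authors' code. It assumes that $\zeta$ denotes the GD path length $\sum_k \|x_{k+1}-x_k\|$ divided by the initial distance $\|x_0-x^*\|$ (the abstract does not state the normalization), and it uses a hypothetical diagonal quadratic $f(x)=\tfrac12\sum_i \lambda_i x_i^2$ with step size $1/L$. Under that assumption, a contraction $\|x_{k+1}-x^*\|\le(1-c)\|x_k-x^*\|$ gives $\zeta \le (2-c)/c$ by the triangle inequality and a geometric sum, which is the $\mathcal{O}(1/c)$ behavior in (a). The names `gd_path_length`, `eigs`, and `bound` are illustrative only.

```python
import math

def gd_path_length(eigs, x0, step, iters=10_000):
    """Run GD on f(x) = 0.5 * sum(lam_i * x_i**2), whose minimizer is x* = 0.

    Returns (total path length, final iterate).
    """
    x = list(x0)
    length = 0.0
    for _ in range(iters):
        # Gradient of the diagonal quadratic is lam_i * x_i in coordinate i.
        x_next = [xi - step * lam * xi for lam, xi in zip(eigs, x)]
        # Accumulate the Euclidean length of this step.
        length += math.sqrt(sum((a - b) ** 2 for a, b in zip(x_next, x)))
        x = x_next
    return length, x

# Hypothetical test problem: d = 20, eigenvalues spread over [1, kappa].
d, kappa = 20, 100.0
eigs = [1.0 + (kappa - 1.0) * i / (d - 1) for i in range(d)]
x0 = [1.0] * d
step = 1.0 / kappa  # step size 1/L with L = kappa, mu = 1

length, x_final = gd_path_length(eigs, x0, step)

# Assumed normalization: path length divided by ||x0 - x*|| with x* = 0.
x0_norm = math.sqrt(sum(v * v for v in x0))
zeta = length / x0_norm

# With step 1/L every coordinate shrinks by a factor of at most 1 - mu/L,
# so the iterates contract with c = 1/kappa.
c = 1.0 / kappa
bound = (2.0 - c) / c  # triangle inequality + geometric series

print(f"zeta = {zeta:.4f}, (2 - c)/c = {bound:.1f}, sqrt(d) = {math.sqrt(d):.4f}")
```

In this example each coordinate decays monotonically to zero, so the path length is also at most $\|x_0\|_1 \le \sqrt{d}\,\|x_0\|_2$. The computed ratio therefore sits far below the generic $(2-c)/c$ bound, in line with the abstract's remark that for quadratics $\zeta$ can be independent of $\kappa$.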