We describe the first gradient methods on Riemannian manifolds to achieve accelerated rates in the non-convex case. Under Lipschitz assumptions on the Riemannian gradient and Hessian of the cost function, these methods find approximate first-order critical points faster than regular gradient descent. A randomized version also finds approximate second-order critical points. Both the algorithms and their analyses build extensively on existing work in the Euclidean case. The basic operation consists in running the Euclidean accelerated gradient descent method (appropriately safeguarded against non-convexity) in the current tangent space, then moving back to the manifold and repeating. This requires lifting the cost function from the manifold to the tangent space, which can be done for example through the Riemannian exponential map. For this approach to succeed, the lifted cost function (called the pullback) must retain certain Lipschitz properties. As a contribution of independent interest, we prove precise claims to that effect, with explicit constants. Those claims are affected by the Riemannian curvature of the manifold, which in turn affects the worst-case complexity bounds for our optimization algorithms.
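The basic operation described above (lift the cost to the current tangent space, optimize the pullback there, retract to the manifold, repeat) can be sketched on the unit sphere. This is an illustrative toy only: it uses plain Euclidean gradient steps on the pullback of a Rayleigh quotient, with finite-difference gradients, not the paper's accelerated, safeguarded method. The cost function, step sizes, and iteration counts are all assumptions chosen for the sketch.

```python
import numpy as np

def exp_map(x, v):
    # Riemannian exponential map on the unit sphere: follow the great
    # circle from x in the tangent direction v for length ||v||.
    nv = np.linalg.norm(v)
    if nv < 1e-12:
        return x
    return np.cos(nv) * x + np.sin(nv) * (v / nv)

def pullback_grad(f, x, v, eps=1e-6):
    # Finite-difference gradient of the pullback g(v) = f(Exp_x(v)),
    # projected onto the tangent space T_x = {u : <u, x> = 0}.
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = eps
        g[i] = (f(exp_map(x, v + e)) - f(exp_map(x, v - e))) / (2 * eps)
    return g - (g @ x) * x

# Toy cost: Rayleigh quotient of a diagonal matrix; its minimum over the
# sphere is the smallest eigenvalue, here -1.
A = np.diag([3.0, 2.0, 1.0, 0.5, -1.0])
f = lambda y: y @ A @ y

x = np.ones(5) / np.sqrt(5.0)    # starting point on the sphere
for outer in range(100):
    v = np.zeros(5)              # lift: origin of the tangent space at x
    for inner in range(10):      # Euclidean gradient steps on the pullback
        v = v - 0.02 * pullback_grad(f, x, v)
    x = exp_map(x, v)            # move back to the manifold and repeat

print(round(f(x), 6))            # approaches the smallest eigenvalue, -1.0
```

Swapping the inner loop for a safeguarded accelerated method is exactly where the pullback's Lipschitz properties matter: the Euclidean analysis applies only if the lifted cost retains gradient and Hessian Lipschitz constants, which the abstract notes depend on the manifold's curvature.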