Existing gradient-based optimization methods update parameters locally, in a direction that minimizes the loss function. We study a different approach, symmetry teleportation, that allows parameters to travel a large distance on the loss level set, in order to improve the convergence speed in subsequent steps. Teleportation exploits symmetries in the loss landscape of optimization problems. We derive loss-invariant group actions for test functions in optimization and for multi-layer neural networks, and prove a necessary condition for teleportation to improve the convergence rate. We also show that our algorithm is closely related to second-order methods. Experimentally, we show that teleportation improves the convergence speed of gradient descent and AdaGrad for several optimization problems, including test functions, multi-layer regressions, and MNIST classification.
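To make the idea concrete, below is a minimal, hypothetical sketch (not the authors' implementation) on the toy loss L(w1, w2) = (w1*w2 - 1)^2, which is invariant under the group action (w1, w2) -> (g*w1, w2/g) for any g != 0. Teleportation searches over group elements for a point on the same level set with a larger gradient norm, then resumes gradient descent from there; the candidate range, learning rate, and teleportation step are illustrative choices.

```python
# Sketch of symmetry teleportation on a toy scale-invariant loss.
import numpy as np

def loss(w):
    return (w[0] * w[1] - 1.0) ** 2

def grad(w):
    r = 2.0 * (w[0] * w[1] - 1.0)
    return np.array([r * w[1], r * w[0]])

def teleport(w, candidates=np.linspace(0.25, 4.0, 64)):
    # Search over group elements g: the loss is unchanged along the orbit,
    # but the gradient norm is not. Return the orbit point with largest norm.
    orbit = [np.array([g * w[0], w[1] / g]) for g in candidates]
    return max(orbit, key=lambda v: np.linalg.norm(grad(v)))

def gd(w, lr=1e-3, steps=200, use_teleport=False):
    w = np.array(w, dtype=float)
    for t in range(steps):
        if use_teleport and t == 10:
            w = teleport(w)          # loss(w) is unchanged at this step
        w = w - lr * grad(w)
    return loss(w)

print(gd([1.0, 4.0]))                     # plain gradient descent
print(gd([1.0, 4.0], use_teleport=True))  # with one teleportation step
```

In this sketch a single teleportation early in training moves the parameters to a point with the same loss but a larger gradient, after which the same gradient-descent updates reach a lower loss within the step budget.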