Choosing the right parameters for optimization algorithms is often the key to their success in practice. Solving this problem with a learning-to-learn approach, where meta-gradient descent is applied to a meta-objective based on the trajectory that the optimizer generates, was recently shown to be effective. However, the meta-optimization problem is difficult. In particular, the meta-gradient can often explode or vanish, and the learned optimizer may not generalize well if the meta-objective is not chosen carefully. In this paper we give meta-optimization guarantees for the learning-to-learn approach on a simple problem: tuning the step size for a quadratic loss. Our results show that the na\"ive objective suffers from the meta-gradient explosion/vanishing problem. Although the meta-objective can be designed so that the meta-gradient remains polynomially bounded, computing the meta-gradient directly by backpropagation still leads to numerical issues. We also characterize when it is necessary to compute the meta-objective on a separate validation set in order to ensure the generalization performance of the learned optimizer. Finally, we verify our results empirically and show that a similar phenomenon appears even for more complicated learned optimizers parametrized by neural networks.
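To make the setup concrete, the sketch below (not the paper's code) shows the na\"ive trajectory-based meta-objective for tuning a single step size on a quadratic, written in JAX; the matrix A, the horizon T, and the initial point are illustrative assumptions. Backpropagating through the unrolled trajectory is exactly where the explosion/vanishing behavior of the meta-gradient shows up.

```python
# Minimal sketch (assumed setup, not the paper's code): meta-gradient of the
# final loss of an unrolled gradient-descent trajectory on a quadratic,
# taken with respect to the step size eta.
import jax
import jax.numpy as jnp

A = jnp.diag(jnp.array([0.1, 1.0, 10.0]))  # quadratic with varying curvature (illustrative)
x0 = jnp.ones(3)
T = 50  # unroll horizon (illustrative)

def loss(x):
    return 0.5 * x @ A @ x

def meta_objective(eta):
    # Unroll T steps of gradient descent with step size eta and return the
    # final loss; this is the naive trajectory-based meta-objective.
    x = x0
    for _ in range(T):
        x = x - eta * jax.grad(loss)(x)
    return loss(x)

# Differentiating through the unrolled trajectory gives the meta-gradient.
meta_grad = jax.grad(meta_objective)

for eta in [0.01, 0.1, 0.19, 0.21]:
    print(eta, float(meta_grad(eta)))
# Once eta exceeds 2 / lambda_max the iterates diverge and the meta-gradient
# grows exponentially in T; for very small eta it can instead become tiny.
```

In this sketch the meta-gradient's magnitude scales like a power of the per-step contraction factor, which is why its size depends so sharply on the horizon and the step size, motivating the more careful meta-objective designs discussed in the paper.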