Designing faster optimization algorithms is of ever-growing interest. In recent years, learning-to-learn methods that learn how to optimize have demonstrated very encouraging results. Current approaches usually do not effectively include the dynamics of the optimization process during training: they either omit it entirely or only implicitly assume the dynamics of an isolated parameter. In this paper, we show how to utilize the dynamic mode decomposition method to extract informative features about optimization dynamics. By employing those features, we show that our learned optimizer generalizes much better to unseen optimization problems. The improved generalization is illustrated on multiple tasks, where an optimizer trained on one neural network generalizes to different architectures and distinct datasets.
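As background for the DMD step mentioned above, the following is a minimal NumPy sketch of exact dynamic mode decomposition, not the paper's implementation; the function name `dmd` and its arguments are illustrative. Given a matrix whose columns are successive states of the optimization process, DMD fits a low-rank linear operator A such that x_{k+1} ≈ A x_k; the eigenvalues and modes of A are the kind of compact features about the dynamics that a learned optimizer could consume.

```python
import numpy as np

def dmd(snapshots, rank):
    """Exact dynamic mode decomposition of a snapshot matrix.

    snapshots: (n, m) array whose columns are successive states
               (e.g., recorded parameter or gradient histories).
    rank:      truncation rank r of the reduced operator.
    Returns (eigenvalues, modes) of the fitted linear dynamics
    x_{k+1} ~ A x_k.
    """
    # Pair each snapshot with its successor: Y ~ A X.
    X, Y = snapshots[:, :-1], snapshots[:, 1:]
    # Rank-r truncated SVD of X.
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, V = U[:, :rank], s[:rank], Vh[:rank].conj().T
    # Reduced operator A_tilde = U* Y V Sigma^{-1}.
    A_tilde = U.conj().T @ Y @ V / s
    eigvals, W = np.linalg.eig(A_tilde)
    # Exact DMD modes lifted back to the full state space.
    modes = (Y @ V / s) @ W
    return eigvals, modes
```

In this sketch, the eigenvalues indicate growth, decay, and oscillation of the dominant directions of the iterate trajectory, which is one plausible way such dynamics features could be assembled before being fed to an optimizer network.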