We propose a new approach to learned optimization in which the computation of an optimizer's update step is represented by a neural network. The parameters of the optimizer are then learned by training on a set of optimization tasks, with the objective of performing minimization efficiently. Our innovation is a new neural network architecture, Optimus, for the learned optimizer, inspired by the classic BFGS algorithm. As in BFGS, we estimate a preconditioning matrix as a sum of rank-one updates, but use a Transformer-based neural network to predict these updates jointly with the step length and direction. In contrast to several recent learned optimization approaches, our formulation allows for conditioning across the dimensions of the target problem's parameter space while remaining applicable to optimization tasks of variable dimensionality without retraining. We demonstrate the advantages of our approach on a benchmark composed of objective functions traditionally used for the evaluation of optimization algorithms, as well as on the real-world task of physics-based visual reconstruction of articulated 3D human motion.
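To make the update rule concrete, the following is a minimal, hypothetical sketch (not the authors' implementation): a stand-in model plays the role of the Transformer, predicting a rank-one correction to the preconditioner together with a step length and direction, and the parameters move along the preconditioned direction.

```python
import numpy as np

def learned_step(theta, grad, P, model):
    """One update in the spirit of the abstract: `model` (a stand-in for the
    Transformer) predicts a rank-one correction u to the preconditioner P,
    jointly with a step length alpha and a direction d; the parameters then
    move along the preconditioned direction -alpha * (P @ d)."""
    u, alpha, d = model(theta, grad)
    P = P + np.outer(u, u)           # accumulate rank-one updates, as in BFGS
    theta = theta - alpha * (P @ d)  # preconditioned step
    return theta, P

# Toy stand-in model: in the paper this would be a trained Transformer that
# conditions across parameter dimensions; here we simply return the gradient
# as the direction, a fixed step length, and a zero rank-one correction.
def toy_model(theta, grad):
    return np.zeros_like(theta), 1e-2, grad

theta = np.array([3.0, -2.0])
P = np.eye(2)
for _ in range(500):
    grad = 2 * theta  # gradient of f(theta) = ||theta||^2
    theta, P = learned_step(theta, grad, P, toy_model)
print(theta)  # approaches the minimizer at the origin
```

In the learned setting, the quality of the predicted (u, alpha, d) triples would be what training optimizes, with the loss measuring how quickly the resulting trajectory minimizes the target objectives.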