In this paper, we propose a new approach to learned optimization. As is common in the literature, we represent the computation of the optimizer's update step with a neural network. The parameters of the optimizer are then learned on a set of training optimization tasks in order to perform minimization efficiently. Our main innovation is a new neural network architecture for the learned optimizer, inspired by the classic BFGS algorithm. As in BFGS, we estimate a preconditioning matrix as a sum of rank-one updates, but use a transformer-based neural network to predict these updates jointly with the step length and direction. In contrast to several recent learned optimization approaches, our formulation allows for conditioning across different dimensions of the parameter space of the target problem while remaining applicable to optimization tasks of variable dimensionality without retraining. We demonstrate the advantages of our approach on a benchmark composed of objective functions traditionally used for the evaluation of optimization algorithms, as well as on the real-world task of physics-based reconstruction of articulated 3D human motion.
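To make the described update rule concrete, the following is a minimal sketch of one possible reading of such a BFGS-inspired learned step in plain NumPy. It is not the authors' implementation: the function `predict_update`, its signature, and the toy predictor are assumptions standing in for the transformer; only the rank-one accumulation of the preconditioner and the jointly predicted step length and direction are taken from the description above.

```python
# Illustrative sketch only (not the paper's code): one step of a
# BFGS-inspired learned optimizer. `predict_update` stands in for the
# transformer; its inputs and outputs here are assumptions.
import numpy as np

def learned_step(theta, grad, B, predict_update):
    # The network jointly predicts a rank-one update vector v for the
    # preconditioner, a scalar step length alpha, and a search direction d.
    v, alpha, d = predict_update(theta, grad, B)
    # BFGS-style refinement: add a rank-one term to the preconditioning matrix.
    B = B + np.outer(v, v)
    # Take a step of the predicted length along the preconditioned direction.
    theta = theta - alpha * (B @ d)
    return theta, B

# Hypothetical usage on a toy quadratic f(x) = 0.5 * x^T A x.
A = np.diag([1.0, 10.0])
theta = np.array([1.0, 1.0])
B = np.eye(2)

def toy_predictor(theta, grad, B):
    # Placeholder for the learned transformer: a fixed, untrained rule.
    return 0.1 * grad, 0.05, grad

for _ in range(50):
    grad = A @ theta
    theta, B = learned_step(theta, grad, B, toy_predictor)
```

Because the preconditioner is assembled from rank-one terms rather than predicted entry-by-entry, a step of this form can couple different parameter dimensions while the network itself operates on per-dimension features, which is what keeps the scheme applicable to problems of variable dimensionality.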