Trajectory optimization is an efficient approach for solving optimal control problems for complex robotic systems. It relies on two key components: first, the transcription of the optimal control problem into a sparse nonlinear program, and second, a solver to iteratively compute its solution. On the one hand, differential dynamic programming (DDP) provides an efficient approach to transcribe the optimal control problem into a finite-dimensional problem while optimally exploiting the sparsity induced by time. On the other hand, augmented Lagrangian methods make it possible to formulate efficient algorithms with advanced constraint-satisfaction strategies. In this paper, we propose to combine these two approaches into an efficient optimal control algorithm that handles both equality and inequality constraints. Building on the augmented Lagrangian literature, we first derive a generic primal-dual augmented Lagrangian strategy for nonlinear problems with equality and inequality constraints. We then apply it to the dynamic programming principle to solve the value-greedy optimization problems inherent to the backward pass of DDP, which we combine with a dedicated globalization strategy, resulting in a Newton-like algorithm for solving constrained trajectory optimization problems. Contrary to previous attempts at formulating an augmented Lagrangian version of DDP, our approach exhibits adequate convergence properties without any switch in strategies. We empirically demonstrate its effectiveness with several case studies from the robotics literature.
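For context, a minimal sketch of the classical augmented Lagrangian that such strategies build upon is given below; this is the textbook (Powell-Hestenes-Rockafellar) form for equality and inequality constraints, not necessarily the exact primal-dual function derived in the paper. For the problem \(\min_x f(x)\) subject to \(c(x) = 0\) and \(g(x) \le 0\), with penalty parameter \(\mu > 0\) and multipliers \(\lambda, \nu\):

\[
\mathcal{L}_\mu(x, \lambda, \nu) = f(x) + \lambda^\top c(x) + \frac{\mu}{2}\,\lVert c(x) \rVert^2 + \frac{1}{2\mu}\left( \lVert [\nu + \mu\, g(x)]_+ \rVert^2 - \lVert \nu \rVert^2 \right),
\]

where \([\cdot]_+\) denotes the componentwise positive part. After each (approximate) minimization in \(x\), the multipliers are updated as \(\lambda \leftarrow \lambda + \mu\, c(x)\) and \(\nu \leftarrow [\nu + \mu\, g(x)]_+\). Applying such a scheme stagewise within the DDP backward pass, rather than to the full nonlinear program at once, is what preserves the time-induced sparsity the abstract refers to.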