Differential dynamic programming (DDP) is a direct single shooting method for trajectory optimization. Its efficiency derives from exploiting the temporal structure inherent to optimal control problems and from an explicit roll-out (integration) of the system dynamics. However, it suffers from numerical instability and, compared with direct multiple shooting methods, it has limited initialization options (it allows initialization of controls, but not of states) and lacks proper handling of control constraints. In this work, we tackle these issues with a feasibility-driven approach that regulates the dynamic feasibility during the numerical optimization and enforces control limits. Our feasibility search emulates the numerical resolution of a direct multiple shooting problem with only dynamics constraints. We show that our approach (named BOX-FDDP) has better numerical convergence than BOX-DDP+ (a single shooting method), and that its convergence rate and runtime performance are competitive with state-of-the-art direct transcription formulations solved using the interior point and active set algorithms available in KNITRO. We further show that BOX-FDDP decreases the dynamic feasibility error monotonically, as in state-of-the-art nonlinear programming algorithms. We demonstrate the benefits of our approach by generating complex and athletic motions for quadruped and humanoid robots. Finally, we highlight that BOX-FDDP is suitable for model predictive control in legged robots.
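To make the notion of dynamic feasibility concrete, the following is a minimal illustrative sketch and not the BOX-FDDP algorithm itself. It assumes a hypothetical scalar linear system x_next = a*x + b*u, and all function names and numeric values are invented for the illustration. It shows the dynamics gaps (defects) that appear when states are initialized independently of the controls, as in multiple shooting, and how a Newton-type step of length alpha on the dynamics constraints scales every gap by (1 - alpha) when the dynamics are linear.

```python
# Illustrative sketch only: hypothetical scalar linear dynamics, not the paper's solver.

def step(x, u, a=0.9, b=0.5):
    """Assumed dynamics: x_next = a*x + b*u."""
    return a * x + b * u


def rollout(x0, us):
    """Single shooting: integrate the dynamics from x0, so all gaps are zero."""
    xs = [x0]
    for u in us:
        xs.append(step(xs[-1], u))
    return xs


def gaps(xs, us):
    """Dynamic feasibility error per node: gap[k] = f(xs[k], us[k]) - xs[k+1]."""
    return [step(xs[k], us[k]) - xs[k + 1] for k in range(len(us))]


def feasibility_step(xs, us, alpha, a=0.9):
    """Move the states along the linearized-dynamics direction, controls held fixed.

    The direction satisfies dx[k+1] = a*dx[k] + gap[k] with dx[0] = 0
    (the initial state is kept fixed).
    """
    g = gaps(xs, us)
    dx = [0.0]
    for k in range(len(us)):
        dx.append(a * dx[k] + g[k])
    return [x + alpha * d for x, d in zip(xs, dx)]


us = [1.0, -0.5, 0.25]            # control guess
xs_guess = [0.0, 2.0, -1.0, 3.0]  # state guess that violates the dynamics

g_before = gaps(xs_guess, us)
g_half = gaps(feasibility_step(xs_guess, us, alpha=0.5), us)
g_full = gaps(feasibility_step(xs_guess, us, alpha=1.0), us)

print("gaps before:      ", g_before)
print("gaps, alpha = 0.5:", g_half)
print("gaps, alpha = 1.0:", g_full)
print("gaps of a rollout:", gaps(rollout(0.0, us), us))
```

A state guess such as `xs_guess` cannot be passed to a pure single shooting method, which only accepts controls and obtains the states by roll-out. For nonlinear dynamics the (1 - alpha) scaling holds only to first order, so this sketch should be read as intuition for a monotone decrease of the feasibility error under the stated assumptions.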