Many robotics domains use some form of nonconvex model predictive control (MPC) for planning, which sets a reduced time horizon, performs trajectory optimization, and replans at every step. The actual task typically requires a much longer horizon than is computationally tractable, and is specified via a cost function that accumulates over that full horizon. For instance, an autonomous car may have a cost function that makes a desired trade-off between efficiency, safety, and obeying traffic laws. In this work, we challenge the common assumption that the cost we optimize using MPC should be the same as the ground truth cost for the task (plus a terminal cost). MPC solvers can suffer from short planning horizons, local optima, and incorrect dynamics models, and, importantly, they fail to account for their ability to replan in the future. Thus, we propose that in many tasks it could be beneficial to purposefully choose a different cost function for MPC to optimize: one that results in the MPC rollout, rather than the MPC planned trajectory, having low ground truth cost. We formalize this as an optimal cost design problem, and propose a zeroth-order optimization-based approach that enables us to design optimal costs for an MPC-planning robot in continuous MDPs. We test our approach in an autonomous driving domain, where we find costs different from the ground truth that implicitly compensate for replanning, short horizons, incorrect dynamics models, and local minima. As an example, the learned cost incentivizes MPC to delay its decision until later, implicitly accounting for the fact that it will get more information in the future and be able to make a better decision. Code and videos are available at https://sites.google.com/berkeley.edu/ocd-mpc/.
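To make the structure of the optimal cost design problem concrete, the following is a minimal sketch on a toy problem and not the implementation used in this work. Everything in it is an assumption made for illustration: the 1-D dynamics, the two cost features, the constants `GOAL`, `HORIZON`, and `TASK_STEPS`, and the helper names. The outer loop uses a simple accept-if-better random search, which is one zeroth-order method and not necessarily the one used here. The key point it illustrates is that the planner minimizes a surrogate cost with weights `theta`, while candidate weights are scored by the ground truth cost of the closed-loop MPC rollout.

```python
import itertools
import random

# All constants below are hypothetical choices for a toy illustration.
ACTIONS = (-1.0, 0.0, 1.0)   # discrete action set
GOAL = 5.0                   # target position
HORIZON = 2                  # short MPC planning horizon
TASK_STEPS = 12              # full task horizon
TRUE_THETA = (1.0, 0.5)      # ground truth weights: (distance, effort)


def step(x, a):
    """Toy 1-D dynamics: the action shifts the position."""
    return x + a


def stage_cost(x, a, theta):
    """Weighted sum of two cost features: distance to goal and effort."""
    return theta[0] * abs(x - GOAL) + theta[1] * abs(a)


def planned_cost(x, seq, theta):
    """Cost of an open-loop action sequence under the weights theta."""
    total = 0.0
    for a in seq:
        x = step(x, a)
        total += stage_cost(x, a, theta)
    return total


def plan(x, theta):
    """Brute-force short-horizon plan under theta; return the first action."""
    best_seq = min(
        itertools.product(ACTIONS, repeat=HORIZON),
        key=lambda seq: planned_cost(x, seq, theta),
    )
    return best_seq[0]


def rollout_true_cost(theta, x0=0.0):
    """Run MPC in closed loop with surrogate weights theta,
    but score the executed trajectory with the ground truth cost."""
    x, total = x0, 0.0
    for _ in range(TASK_STEPS):
        a = plan(x, theta)          # replan at every step
        x = step(x, a)
        total += stage_cost(x, a, TRUE_THETA)
    return total


def design_cost(iters=50, sigma=0.3, seed=0):
    """Zeroth-order search over surrogate weights (accept-if-better)."""
    rng = random.Random(seed)
    best_theta = TRUE_THETA                      # start from the ground truth
    best_cost = rollout_true_cost(best_theta)
    for _ in range(iters):
        cand = tuple(w + rng.gauss(0.0, sigma) for w in best_theta)
        cost = rollout_true_cost(cand)
        if cost < best_cost:                     # keep only improvements
            best_theta, best_cost = cand, cost
    return best_theta, best_cost


if __name__ == "__main__":
    theta, cost = design_cost()
    print("designed weights:", theta)
    print("rollout ground truth cost:", cost)
```

Because the search starts from the ground truth weights and only accepts improvements, the designed cost can never do worse than planning with the ground truth cost on this toy problem. Whether it does strictly better depends on how much the short horizon, local optima, or model error hurt the planner in the first place.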