Motion planning and control are crucial components of robotics applications. Here, spatio-temporal hard constraints such as system dynamics and safety boundaries (e.g., obstacles in automated driving) restrict the robot's motions. Direct methods from optimal control handle these constraints by solving a constrained optimization problem. However, in many applications finding a proper cost function is inherently difficult because partially conflicting objectives must be weighted against each other. Imitation Learning (IL) methods such as Behavior Cloning (BC), on the other hand, provide an intuitive framework for learning decision-making from offline demonstrations and constitute a promising avenue for planning and control in complex robot applications. Prior work has primarily relied on soft-constraint approaches, which add auxiliary loss terms that describe the constraints. Because such terms only penalize violations rather than rule them out, catastrophic safety-critical failures might still occur in out-of-distribution (OOD) scenarios. This work integrates the flexibility of IL with the hard-constraint handling of optimal control. Our approach constitutes a general framework for constrained robotic motion planning and control using offline IL. Hard constraints are integrated into the learning problem in a differentiable manner, via explicit completion and gradient-based correction. Simulated experiments on mobile robot navigation and automated driving provide evidence for the performance of the proposed method.
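To give a rough intuition for the two mechanisms named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a planar robot with single-integrator dynamics, a single circular obstacle, a small safety margin, and hand-derived gradients; the function names (`complete`, `violation`, `correct`) and all numerical values are hypothetical. Completion here means that the states are never predicted directly but are derived from the controls through the dynamics, so the dynamics constraint holds by construction. Correction means taking gradient steps on the controls until the completed states leave the obstacle.

```python
import numpy as np


def complete(x0, u, dt):
    """Completion: derive states x_1..x_T from controls via x_{t+1} = x_t + dt * u_t."""
    return x0 + dt * np.cumsum(u, axis=0)


def violation(x, center, radius):
    """Per-state obstacle violation max(0, r^2 - ||x_t - c||^2); zero outside the disc."""
    g = radius ** 2 - np.sum((x - center) ** 2, axis=1)
    return np.maximum(g, 0.0)


def correct(x0, u, dt, center, radius, margin=0.01, step=5.0, iters=2000):
    """Gradient-based correction of the controls; states are re-completed in every step."""
    u = np.array(u, dtype=float)
    for _ in range(iters):
        x = complete(x0, u, dt)
        if not violation(x, center, radius).any():
            break  # every completed state lies outside the true obstacle
        # Penalize a slightly inflated obstacle so the true boundary is cleared
        # after finitely many steps (assumed margin, not from the source).
        v = violation(x, center, radius + margin)
        # Gradient of V = sum_t v_t^2 with respect to each state x_t.
        dV_dx = 2.0 * v[:, None] * (-2.0 * (x - center))
        # Control u_k shifts all later states, hence the reversed cumulative sum.
        dV_du = dt * np.cumsum(dV_dx[::-1], axis=0)[::-1]
        u -= step * dV_du
    return u


# Hypothetical example: a straight-line control sequence that cuts through the obstacle.
x0 = np.zeros(2)
dt = 0.1
center, radius = np.array([0.5, 0.05]), 0.2
u_pred = np.tile([1.0, 0.0], (10, 1))  # stand-in for the output of a learned policy
u_safe = correct(x0, u_pred, dt, center, radius)
x_safe = complete(x0, u_safe, dt)
```

In a learning setting, both steps would be written in an automatic-differentiation framework so that the imitation loss can be backpropagated through them; the analytic gradient above only stands in for that. Note that this sketch checks the obstacle at the discrete states only, not along the segments between them.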