We study the class of reach-avoid dynamic games in which multiple agents interact noncooperatively, and each wishes to satisfy a distinct target condition while avoiding a failure condition. Reach-avoid games are commonly used to express safety-critical optimal control problems found in mobile robot motion planning. While a wide variety of approaches exist for these motion planning problems, we focus on finding time-consistent solutions, in which planned future motion remains optimal despite prior suboptimal actions. Though abstract, time consistency encapsulates an extremely desirable property: time-consistent motion plans remain optimal even when a robot's motion diverges from the plan early on due to, e.g., intrinsic dynamic uncertainty or extrinsic environment disturbances. Our main contribution is a computationally efficient algorithm for multi-agent reach-avoid games that yields time-consistent solutions. We demonstrate our approach in a simulated driving scenario, where we construct a two-player adversarial game to model a range of defensive driving behaviors.