Many autonomous agents, such as intelligent vehicles, are inherently required to interact with one another. Game theory provides a natural mathematical tool for robot motion planning in such interactive settings. However, tractable algorithms for such problems usually rely on a strong assumption, namely that the objectives of all players in the scene are known. To make such tools applicable for ego-centric planning with only local information, we propose an adaptive model-predictive game solver, which jointly infers other players' objectives online and computes a corresponding generalized Nash equilibrium (GNE) strategy. The adaptivity of our approach is enabled by a differentiable trajectory game solver whose gradient signal is used for maximum likelihood estimation (MLE) of opponents' objectives. This differentiability of our pipeline facilitates direct integration with other differentiable elements, such as neural networks (NNs). Furthermore, in contrast to existing solvers for cost inference in games, our method handles not only partial state observations but also general inequality constraints. In two simulated traffic scenarios, we find superior performance of our approach over both existing game-theoretic methods and non-game-theoretic model-predictive control (MPC) approaches. We also demonstrate our approach's real-time planning capabilities and robustness in two hardware experiments.
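To make the idea of estimating an opponent's objective through a differentiable game solver concrete, the following is a minimal sketch on a toy problem, not the solver described above. It assumes a single-stage, unconstrained, two-player quadratic game whose Nash equilibrium has a closed form, so the sensitivity of the equilibrium to the unknown parameter can be written down directly. The cost functions, the coupling weight `W`, and the names `solve_game`, `du2_dtheta`, and `estimate_theta` are all hypothetical and chosen only for illustration. Under an assumed Gaussian observation model, maximum likelihood estimation reduces to least squares on the observed opponent actions.

```python
import random

W = 0.5  # hypothetical coupling weight between the two players


def solve_game(g1, theta, w=W):
    """Nash equilibrium of a toy two-player quadratic game (no constraints).

    Player 1 minimizes (u1 - g1)^2 + w * (u1 - u2)^2 over u1.
    Player 2 minimizes (u2 - theta)^2 + w * (u2 - u1)^2 over u2.
    Setting both first-order conditions to zero gives the linear system
        [[1 + w, -w], [-w, 1 + w]] @ [u1, u2] = [g1, theta],
    which is solved here in closed form.
    """
    det = (1 + w) ** 2 - w ** 2
    u1 = ((1 + w) * g1 + w * theta) / det
    u2 = (w * g1 + (1 + w) * theta) / det
    return u1, u2


def du2_dtheta(w=W):
    """Sensitivity of the opponent's equilibrium action to theta.

    Differentiating the linear system above with respect to theta gives
    d[u1, u2]/dtheta = A^{-1} @ [0, 1]; its second entry is (1 + w) / det.
    """
    det = (1 + w) ** 2 - w ** 2
    return (1 + w) / det


def estimate_theta(observations, g1, theta0=0.0, lr=0.5, steps=200):
    """Gradient descent on the least-squares (Gaussian MLE) objective.

    Loss: 0.5 * mean((u2(theta) - y)^2) over the observed opponent actions y.
    The gradient is obtained by the chain rule through the game solution.
    """
    theta = theta0
    for _ in range(steps):
        _, u2 = solve_game(g1, theta)
        mean_residual = sum(u2 - y for y in observations) / len(observations)
        theta -= lr * mean_residual * du2_dtheta()
    return theta


if __name__ == "__main__":
    rng = random.Random(0)
    g1, true_theta = 1.0, 2.0
    _, u2_true = solve_game(g1, true_theta)
    # Only the opponent's action is observed, corrupted by Gaussian noise.
    noisy = [u2_true + rng.gauss(0.0, 0.05) for _ in range(50)]
    theta_hat = estimate_theta(noisy, g1)
    # The ego player then plans against the estimated opponent objective.
    u1_plan, _ = solve_game(g1, theta_hat)
    print("estimated theta:", theta_hat, "ego action:", u1_plan)
```

In this toy setting the estimate and the ego plan come from the same equilibrium computation, which mirrors the joint inference and planning loop at a very small scale. Trajectory-level games with inequality constraints would need a numerical equilibrium solver and a more general way of differentiating through it, which this sketch does not attempt.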