Multimodal Maximum Entropy Dynamic Games

Environments with multi-agent interactions often give rise to a rich set of behavioral modalities between agents, because decision making is inherently suboptimal when agents settle for satisfactory decisions. However, existing algorithms for solving these dynamic games are strictly unimodal and fail to capture the intricate multimodal behaviors of the agents. In this paper, we propose MMELQGames (Multimodal Maximum-Entropy Linear Quadratic Games), a novel constrained multimodal maximum entropy formulation of the Differential Dynamic Programming algorithm for solving for generalized Nash equilibria. By formulating the problem as a dynamic game with incomplete and asymmetric information, in which agents are uncertain about the cost and dynamics of the game itself, the proposed method is able to reason about multiple local generalized Nash equilibria, enforce constraints with the Augmented Lagrangian framework, and perform Bayesian inference on the latent mode from past observations. We assess the efficacy of the proposed algorithm on two illustrative examples: multi-agent collision avoidance and autonomous racing. In particular, we show that only MMELQGames is able to effectively block a rear vehicle when the blocking vehicle has a speed disadvantage and the rear vehicle can overtake from multiple positions.
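The Bayesian inference on the latent mode mentioned above can be illustrated with a minimal sketch. This is not the MMELQGames implementation: it assumes a discrete set of modes, a single scalar observation per step, and an isotropic Gaussian observation model, and the names `gaussian_log_likelihood`, `update_mode_belief`, `mode_predictions` and all numeric values are hypothetical.

```python
import numpy as np


def gaussian_log_likelihood(observed, predicted, sigma):
    """Log-likelihood of an observation under an isotropic Gaussian centred on
    one mode's prediction, dropping the constant shared by all modes."""
    diff = np.asarray(observed, dtype=float) - np.asarray(predicted, dtype=float)
    return -0.5 * float(np.sum(diff ** 2)) / sigma ** 2


def update_mode_belief(prior, log_likelihoods):
    """One Bayes step over a discrete latent mode: posterior is proportional
    to prior times likelihood, computed in log space for stability."""
    log_post = np.log(np.asarray(prior, dtype=float)) + np.asarray(log_likelihoods, dtype=float)
    log_post -= np.max(log_post)  # shift so the largest term is exp(0)
    post = np.exp(log_post)
    return post / post.sum()


# Hypothetical racing example: the rear vehicle may overtake on either side.
# Each mode predicts a lateral offset (assumed values, in metres).
mode_predictions = {"overtake_left": -1.0, "overtake_right": 1.0}
observations = [0.6, 0.8, 0.9]  # assumed observed lateral offsets

belief = np.array([0.5, 0.5])  # uniform prior over the two modes
for obs in observations:
    log_liks = [
        gaussian_log_likelihood(obs, pred, sigma=0.5)
        for pred in mode_predictions.values()
    ]
    belief = update_mode_belief(belief, log_liks)

print(dict(zip(mode_predictions, belief)))
```

With these assumed observations the belief concentrates on `overtake_right`. In the setting described above, the per-mode predictions would presumably come from the local generalized Nash equilibria rather than from fixed offsets.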
