We present a multi-agent decision-making framework for the emergent coordination of autonomous agents whose intents are initially undecided. Dynamic non-cooperative games have been used to encode multi-agent interaction, but ambiguity arising from factors such as goal preference or the presence of multiple equilibria can lead to coordination failures, ranging from the "freezing robot" problem to unsafe behavior in safety-critical events. The recently developed nonlinear opinion dynamics (NOD) provide guarantees for breaking deadlocks. However, automatically choosing appropriate model parameters in general multi-agent settings remains a challenge. In this paper, we first propose a novel and principled procedure for synthesizing NOD from the value functions of dynamic games conditioned on agents' intents. In particular, for the two-player, two-option case, we provide precise stability conditions for equilibria of the game-induced NOD based on the mismatch between agents' opinions and their game values. We then propose a trajectory optimization algorithm that computes agents' policies guided by the evolution of opinions. The efficacy of our method is illustrated with a simulated toll station coordination example.
Title: Emergent Coordination through Game-Induced Nonlinear Opinion Dynamics
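To make the deadlock-breaking property of NOD concrete, the following is a minimal sketch of a generic two-agent, two-option opinion model of the form commonly used in the NOD literature. It is not the game-induced synthesis proposed in the paper: the function names, the parameter values (`d`, `alpha`, `gamma`), the initial opinions, and the Euler integration are all illustrative assumptions. Each agent holds a scalar opinion `z_i`, where a positive value favors one option and a negative value favors the other. With zero bias, linearizing around the neutral opinion suggests that it loses stability once the attention `u` exceeds `d / (alpha + |gamma|)`.

```python
import math


def simulate_nod(u, z0=(0.01, -0.005), d=1.0, alpha=1.2, gamma=-0.8,
                 dt=0.01, steps=5000):
    """Euler-integrate a generic two-agent, two-option opinion model.

    All parameter values are hypothetical and chosen for illustration only.
    z_i > 0 means agent i leans toward option A, z_i < 0 toward option B.
    u      : attention (gain on the social term)
    d      : damping that pulls opinions back to neutral
    alpha  : self-reinforcement weight
    gamma  : inter-agent weight (negative = agents prefer to disagree)
    """
    z1, z2 = z0
    for _ in range(steps):
        dz1 = -d * z1 + math.tanh(u * (alpha * z1 + gamma * z2))
        dz2 = -d * z2 + math.tanh(u * (alpha * z2 + gamma * z1))
        z1 += dt * dz1
        z2 += dt * dz2
    return z1, z2


def critical_attention(d=1.0, alpha=1.2, gamma=-0.8):
    """Attention level at which the neutral opinion z = 0 loses stability
    in the linearization of the symmetric two-agent model above."""
    return d / (alpha + abs(gamma))


if __name__ == "__main__":
    u_star = critical_attention()
    low = simulate_nod(0.6 * u_star)   # below threshold: opinions stay neutral
    high = simulate_nod(2.0 * u_star)  # above threshold: opinions split
    print("critical attention:", u_star)
    print("below threshold:", low)
    print("above threshold:", high)
```

With the negative `gamma` assumed here, the two agents settle on opposite options once attention is above the threshold, which loosely mirrors a scenario where two vehicles must pick different toll booths. Below the threshold both opinions decay back to neutral, which corresponds to the indecision the abstract describes.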