Autonomous racing rewards agents that react to opponents' behavior with agile maneuvers that make progress along the track, while penalizing both over-aggressive and over-conservative agents. Understanding the intent of other agents is crucial to deploying autonomous systems in adversarial multi-agent environments. Current approaches either oversimplify the discretization of the agents' action space or fail to account for the long-term effects of actions and thus become myopic. Our work addresses these two challenges. First, we propose a novel dimension reduction method that encapsulates diverse agent behaviors while preserving the continuity of agent actions. Second, we formulate the two-agent racing game as a regret minimization problem and provide a tractable solution for counterfactual regret minimization using a regret prediction model. Finally, we validate our findings experimentally on scaled autonomous vehicles. We demonstrate that the proposed game-theoretic planner, with agent characterization in the objective space, significantly improves the win rate against different opponents, and that the improvement transfers to unseen opponents in an unseen environment.
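To make the regret minimization idea concrete, the following is a minimal sketch of plain regret matching in self-play on a two-player zero-sum matrix game. It is not the planner described above: it is the basic building block that counterfactual regret minimization applies at each decision point, and it uses no regret prediction model. The function names, the 2x2 payoff matrix, and the reading of the two actions as "aggressive" and "conservative" are all hypothetical and chosen only for illustration.

```python
def regret_matching_strategy(cum_regret):
    """Mix over actions in proportion to positive cumulative regret.

    Falls back to the uniform strategy when no action has positive regret.
    """
    positive = [max(r, 0.0) for r in cum_regret]
    total = sum(positive)
    if total > 0.0:
        return [p / total for p in positive]
    return [1.0 / len(cum_regret)] * len(cum_regret)


def solve_matrix_game(payoff, iterations=20000):
    """Self-play regret matching on a zero-sum matrix game.

    payoff[i][j] is the row player's payoff; the column player receives
    the negative. Returns the time-averaged strategies of both players,
    which approach an equilibrium as the number of iterations grows.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    regret_row, regret_col = [0.0] * n_rows, [0.0] * n_cols
    sum_row, sum_col = [0.0] * n_rows, [0.0] * n_cols

    for _ in range(iterations):
        s_row = regret_matching_strategy(regret_row)
        s_col = regret_matching_strategy(regret_col)

        # Expected payoff of each pure action against the opponent's current mix.
        u_row = [sum(payoff[i][j] * s_col[j] for j in range(n_cols))
                 for i in range(n_rows)]
        u_col = [-sum(payoff[i][j] * s_row[i] for i in range(n_rows))
                 for j in range(n_cols)]

        # Expected payoff of the mixed strategy actually played.
        ev_row = sum(s_row[i] * u_row[i] for i in range(n_rows))
        ev_col = sum(s_col[j] * u_col[j] for j in range(n_cols))

        # Regret: how much better each pure action would have done.
        for i in range(n_rows):
            regret_row[i] += u_row[i] - ev_row
            sum_row[i] += s_row[i]
        for j in range(n_cols):
            regret_col[j] += u_col[j] - ev_col
            sum_col[j] += s_col[j]

    return ([s / iterations for s in sum_row],
            [s / iterations for s in sum_col])


if __name__ == "__main__":
    # Hypothetical payoffs: action 0 = "aggressive", action 1 = "conservative".
    toy_payoff = [[2.0, -1.0],
                  [-1.0, 1.0]]
    row_strategy, col_strategy = solve_matrix_game(toy_payoff)
    print("row player:", row_strategy)
    print("column player:", col_strategy)
```

For this toy matrix the unique equilibrium has each player choosing action 0 with probability 2/5, and the averaged strategies drift toward that mix. The sketch uses expected payoffs rather than sampled actions so that each run is deterministic.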