We develop a hierarchical controller for multi-agent autonomous racing. A high-level planner approximates the race as a discrete game with simplified dynamics, encodes the complex safety and fairness rules of real-life racing in that game, and computes a series of target waypoints. A low-level controller takes these waypoints as a reference trajectory and computes high-resolution control inputs by solving a simplified formulation of a multi-agent racing game. We consider two approaches for the low-level controller, yielding two hierarchical controllers: one uses multi-agent reinforcement learning (MARL), and the other solves a linear-quadratic Nash game (LQNG) to produce control inputs. We test the controllers against three baselines: an end-to-end MARL controller, a MARL controller tracking a fixed racing line, and an LQNG controller tracking a fixed racing line. Quantitative results show that the proposed hierarchical methods outperform their respective baselines in both head-to-head race wins and adherence to the rules. The hierarchical controller using MARL for low-level control outperforms all other methods, winning over 88% of head-to-head races and adhering more consistently to the complex racing rules. Qualitatively, we observe the proposed controllers mimicking actions performed by expert human drivers, such as shielding/blocking, overtaking, and long-term planning for delayed advantages. We show that hierarchical planning for game-theoretic reasoning produces competitive behavior even when challenged with complex rules and constraints.