Hierarchical Control for Head-to-Head Autonomous Racing

We develop a hierarchical controller for head-to-head autonomous racing. We first introduce a formulation of a racing game with realistic safety and fairness rules. A high-level planner approximates the original formulation as a discrete game with simplified state, control, and dynamics, which makes the complex safety and fairness rules easy to encode, and calculates a series of target waypoints. The low-level controller takes the resulting waypoints as a reference trajectory and computes high-resolution control inputs by solving an alternative formulation with simplified objectives and constraints. We consider two approaches for the low-level controller, yielding two hierarchical controllers. One approach uses multi-agent reinforcement learning (MARL), and the other solves a linear-quadratic Nash game (LQNG) to produce control inputs. The controllers are compared against three baselines: an end-to-end MARL controller, a MARL controller tracking a fixed racing line, and an LQNG controller tracking a fixed racing line. Quantitative results show that the proposed hierarchical methods outperform their respective baseline methods in terms of head-to-head race wins and adherence to the rules. The hierarchical controller using MARL for low-level control consistently outperforms all other methods, winning over 88% of head-to-head races and adhering more consistently to the complex racing rules. Qualitatively, we observe the proposed controllers mimicking actions performed by expert human drivers, such as shielding/blocking, overtaking, and long-term planning for delayed advantages. We show that hierarchical planning for game-theoretic reasoning produces competitive behavior even when challenged with complex rules and constraints.
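
To make the two-level structure concrete, the following is a minimal sketch, not the controllers evaluated in this work. It assumes a hypothetical three-lane discretization (`LANES`), an opponent whose lane at each checkpoint is predicted in advance, and illustrative cost weights. The high-level planner is reduced to a dynamic program over (checkpoint, lane) cells rather than a discrete game, and the low-level controller is a saturated proportional steering law standing in for the MARL and LQNG controllers. All names and parameters are illustrative assumptions.

```python
from dataclasses import dataclass

# Hypothetical lateral offsets (metres) of three lanes; not taken from the paper.
LANES = (-1.0, 0.0, 1.0)


@dataclass
class Waypoint:
    checkpoint: int   # index of the checkpoint along the track
    lateral: float    # target lateral offset at that checkpoint


def plan_waypoints(start_lane, opponent_lanes,
                   lane_change_cost=1.0, blocked_cost=100.0):
    """High-level stand-in: dynamic programming over (checkpoint, lane) cells.

    opponent_lanes[k] is the lane index the opponent is assumed to occupy
    at checkpoint k, or None if that checkpoint is free.
    """
    n = len(LANES)
    inf = float("inf")
    # cost[l]: cheapest cost found so far to be in lane l
    cost = [0.0 if l == start_lane else inf for l in range(n)]
    back = []  # back[k][l]: best predecessor lane for lane l at checkpoint k
    for k in range(len(opponent_lanes)):
        new_cost, choice = [inf] * n, [0] * n
        for l in range(n):
            occupied = blocked_cost if opponent_lanes[k] == l else 0.0
            for p in range(n):
                if abs(p - l) > 1:
                    continue  # at most one lane change per checkpoint
                c = cost[p] + lane_change_cost * abs(p - l) + occupied
                if c < new_cost[l]:
                    new_cost[l], choice[l] = c, p
        back.append(choice)
        cost = new_cost
    # Backtrack from the cheapest final lane to recover the lane sequence.
    lane = min(range(n), key=lambda l: cost[l])
    lanes = [lane]
    for k in range(len(opponent_lanes) - 1, 0, -1):
        lane = back[k][lane]
        lanes.append(lane)
    lanes.reverse()
    return [Waypoint(k, LANES[l]) for k, l in enumerate(lanes)]


def track_waypoint(lateral, target, gain=0.5, max_steer=0.3):
    """Low-level stand-in: saturated proportional steering toward the waypoint."""
    steer = gain * (target.lateral - lateral)
    return max(-max_steer, min(max_steer, steer))


if __name__ == "__main__":
    # Opponent assumed to sit in the middle lane at checkpoint 1.
    plan = plan_waypoints(start_lane=1, opponent_lanes=[None, 1, None])
    print(plan)
    print(track_waypoint(lateral=0.0, target=plan[0]))
```

The split mirrors the division of labor described above: rule-like constraints (here, only "do not occupy the opponent's cell") live in the coarse discrete plan, while the tracker only needs a simple objective relative to the next waypoint.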
