Feedback Linearization of Car Dynamics for Racing via Reinforcement Learning

Using the method of Learning Feedback Linearization, we seek to learn a linearizing controller that simplifies the task of controlling a car to race autonomously. A soft actor-critic approach is used to learn a decoupling matrix and drift vector that correct for errors in a hand-designed linearizing controller. The result is an exactly linearizing controller, which allows the well-developed theory of linear systems to be used to design path planning and tracking schemes that are easy to implement and significantly less computationally demanding. To demonstrate the method, we first apply it to a simulated model whose exact structure is known but differs from that assumed by the initial controller, so as to introduce error. We then seek to apply the method to a system that introduces even more error: a gym environment specifically designed for modeling the dynamics of car racing. To do so, we propose an extension to Learning Feedback Linearization: a neural network, trained with supervised learning, that converts the output of our linearizing controller into the input required by the racing environment. We report our progress toward these goals and discuss the next steps toward accomplishing them.
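To make the role of the learned decoupling matrix and drift vector concrete, the following is a minimal sketch, not the implementation used in this work. It assumes output dynamics of the form y_ddot = b(x) + A(x) u and a toy car-like model; the names `model`, `nominal`, `true_system`, `oracle_correction`, and `linearizing_control`, and the mass and drag values, are hypothetical. The correction terms, which in this work would be learned with soft actor-critic, are replaced here by the exact model mismatch purely for illustration.

```python
import numpy as np

def model(x, mass, drag):
    """Hypothetical output dynamics y_ddot = b(x) + A(x) @ u.

    x = [speed, heading]; returns the decoupling matrix A and drift vector b.
    """
    s, th = x
    A = (1.0 / mass) * np.array([[np.cos(th), -s * np.sin(th)],
                                 [np.sin(th),  s * np.cos(th)]])
    b = -(drag / mass) * s * np.array([np.cos(th), np.sin(th)])
    return A, b

def nominal(x):
    # Hand-designed model: assumed (incorrect) mass and no drag.
    return model(x, mass=1.0, drag=0.0)

def true_system(x):
    # "Real" plant with different parameters, so the nominal controller has error.
    return model(x, mass=1.3, drag=0.2)

def oracle_correction(x):
    # Stand-in for the learned correction: the exact mismatch (assumption).
    A_t, b_t = true_system(x)
    A_n, b_n = nominal(x)
    return A_t - A_n, b_t - b_n

def zero_correction(x):
    # No learning: the uncorrected hand-designed controller.
    return np.zeros((2, 2)), np.zeros(2)

def linearizing_control(x, v, correction):
    """Solve (A_nom + dA) u = v - (b_nom + db) for the input u."""
    A_n, b_n = nominal(x)
    dA, db = correction(x)
    return np.linalg.solve(A_n + dA, v - (b_n + db))

x = np.array([2.0, 0.3])   # speed 2, heading 0.3 rad
v = np.array([0.5, -1.0])  # desired output acceleration (virtual input)
A_t, b_t = true_system(x)

y_ddot_corrected = b_t + A_t @ linearizing_control(x, v, oracle_correction)
y_ddot_nominal = b_t + A_t @ linearizing_control(x, v, zero_correction)
```

With the correction in place, the closed-loop output acceleration `y_ddot_corrected` matches the virtual input `v`, so the system behaves like a linear double integrator from `v` to the output; with the uncorrected nominal controller it does not. Note that the decoupling matrix in this toy model becomes singular at zero speed, so the sketch only applies away from that point.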
