This paper proposes a new Reinforcement Learning (RL) based control architecture for quadrotors. Whereas the literature focuses on controlling the four rotors' RPMs directly, this paper aims to control the quadrotor's thrust vector. The RL agent computes the percentage of overall thrust along the quadrotor's z-axis together with the desired roll ($φ$) and pitch ($θ$) angles. The agent then sends these control signals, along with the quadrotor's current yaw angle ($ψ$), to an attitude PID controller, which maps them to motor RPMs. The RL agents were trained with the Soft Actor-Critic algorithm, a model-free, off-policy, stochastic RL algorithm. Training results show that the proposed thrust vector controller trains faster than conventional RPM controllers. Simulation results show smoother and more accurate path-following for the proposed thrust vector controller.
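The signal flow described above (agent action, then attitude PID, then motor RPMs) can be illustrated with a minimal Python sketch. It is not the paper's implementation. The action range of [-1, 1], the tilt limit `MAX_TILT`, the motor limit `MAX_RPM`, the PID gains, the X-configuration `MIXER` signs, and the linear mapping from thrust percentage to a base RPM are all assumptions made for illustration, as are the function names.

```python
import numpy as np

MAX_RPM = 20000.0      # hypothetical motor limit, not taken from the paper
MAX_TILT = np.pi / 6   # hypothetical bound on commanded roll/pitch [rad]

# Assumed X-configuration mixer: rows are motors, columns are
# (roll, pitch, yaw) corrections. Each column sums to zero so the
# corrections do not change the mean RPM.
MIXER = np.array([
    [-1.0, +1.0, -1.0],
    [-1.0, -1.0, +1.0],
    [+1.0, -1.0, -1.0],
    [+1.0, +1.0, +1.0],
])


class PID:
    """Textbook PID on a scalar error (gains are placeholders)."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, error, dt):
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def action_to_setpoint(action, current_yaw):
    """Map an assumed agent action in [-1, 1]^3 to
    (thrust fraction, desired roll, desired pitch, yaw reference)."""
    a = np.clip(np.asarray(action, dtype=float), -1.0, 1.0)
    thrust_fraction = 0.5 * (a[0] + 1.0)   # share of overall thrust, in [0, 1]
    roll_des = a[1] * MAX_TILT
    pitch_des = a[2] * MAX_TILT
    # The current yaw is passed through as the yaw reference.
    return thrust_fraction, roll_des, pitch_des, current_yaw


def setpoint_to_rpms(setpoint, attitude, pids, dt):
    """Attitude PID stage: turn the setpoint into four motor RPMs."""
    thrust_fraction, roll_des, pitch_des, yaw_des = setpoint
    roll, pitch, yaw = attitude
    corrections = np.array([
        pids[0].step(roll_des - roll, dt),
        pids[1].step(pitch_des - pitch, dt),
        pids[2].step(yaw_des - yaw, dt),
    ])
    base = thrust_fraction * MAX_RPM       # simplifying linear assumption
    return np.clip(base + MIXER @ corrections, 0.0, MAX_RPM)


if __name__ == "__main__":
    pids = [PID(500.0, 0.0, 0.0) for _ in range(3)]
    sp = action_to_setpoint([0.0, 0.5, 0.0], current_yaw=0.3)
    print(setpoint_to_rpms(sp, (0.0, 0.0, 0.3), pids, dt=0.01))
```

In this sketch the agent's action space has three dimensions rather than the four of a direct RPM controller, and because the yaw reference equals the measured yaw, the yaw error fed to the PID is zero.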