Attitude control of fixed-wing unmanned aerial vehicles (UAVs) is a difficult control problem, in part due to uncertain nonlinear dynamics, actuator constraints, and coupled longitudinal and lateral motions. Current state-of-the-art autopilots are based on linear control and are thus limited in their effectiveness and performance. Deep reinforcement learning (DRL) is a machine learning method that automatically discovers optimal control laws through interaction with the controlled system and can handle complex nonlinear dynamics. We show in this paper that DRL can successfully learn to perform attitude control of a fixed-wing UAV operating directly on the original nonlinear dynamics, requiring as little as three minutes of flight data. We initially train our model in a simulation environment and then deploy the learned controller on the UAV in flight tests, demonstrating comparable performance to the state-of-the-art ArduPlane proportional-integral-derivative (PID) attitude controller with no further online learning required. To better understand the operation of the learned controller, we present an analysis of its behaviour, including a comparison to the existing well-tuned PID controller.