Purpose: Real-life quadrotor applications introduce disturbances and time-varying properties that challenge flight controllers. We observed that, when a quadrotor is tasked with picking up and dropping a payload, traditional PID and RL-based controllers found in the literature struggle to maintain flight after interaction with the external object changes the vehicle's dynamics. Methods: In this work, we introduce domain randomization during the training phase of a low-level waypoint guidance controller based on Soft Actor-Critic. The resulting controller is evaluated on the proposed payload pick-up and drop task, with added disturbances that emulate real-life operation of the vehicle. Results & Conclusion: We show that, by introducing a certain degree of uncertainty in the quadrotor dynamics during training, we obtain a controller that is capable of performing the proposed task over a wider range of quadrotor parameters. Additionally, the RL-based controller outperforms a traditional positional PID controller with optimized gains in this task, while remaining agnostic to different simulation parameters.
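To make the idea of domain randomization in the quadrotor dynamics concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a uniform per-parameter scaling; the class `QuadrotorParams`, the helpers `randomize` and `with_payload`, the nominal values, and the 20% spread are all hypothetical and do not come from the paper.

```python
import random
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class QuadrotorParams:
    # Hypothetical nominal values chosen for illustration only.
    mass: float = 1.0           # kg
    arm_length: float = 0.2     # m
    inertia_xx: float = 0.01    # kg m^2
    thrust_coeff: float = 1e-5  # N per (rad/s)^2


def randomize(nominal: QuadrotorParams, spread: float,
              rng: random.Random) -> QuadrotorParams:
    """Scale each parameter by an independent factor in [1 - spread, 1 + spread]."""
    if not 0.0 <= spread < 1.0:
        raise ValueError("spread must be in [0, 1)")
    scaled = {
        f.name: getattr(nominal, f.name) * rng.uniform(1.0 - spread, 1.0 + spread)
        for f in fields(nominal)
    }
    return replace(nominal, **scaled)


def with_payload(params: QuadrotorParams, payload_mass: float) -> QuadrotorParams:
    """Return the parameters after picking up a payload (mass change only)."""
    return replace(params, mass=params.mass + payload_mass)


if __name__ == "__main__":
    rng = random.Random(0)
    nominal = QuadrotorParams()
    for episode in range(3):
        # A training loop would rebuild the simulator from `params`
        # before resetting the environment for this episode.
        params = randomize(nominal, spread=0.2, rng=rng)
        print(episode, params)
```

Resampling the parameters once per episode means the policy never trains against a single fixed vehicle model, which is the intuition behind tolerating the change in dynamics caused by the payload. A real setup would likely also randomize quantities this sketch omits, such as the remaining inertia terms and the external disturbances.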