Studies that broaden drone applications into complex tasks require a stable control framework. Recently, deep reinforcement learning (RL) algorithms have been applied in many robot-control studies to accomplish complex tasks. Unfortunately, deep RL algorithms may not be suitable for direct deployment on a real-world robot platform because the learned policy is difficult to interpret and lacks a stability guarantee, especially for a complex task such as wall climbing with a drone. This paper proposes a novel hybrid architecture that reinforces a nominal controller with a robust policy learned using a model-free deep RL algorithm. The proposed architecture employs an uncertainty-aware control mixer to preserve the guaranteed stability of the nominal controller while exploiting the extended robust performance of the learned policy. The policy is trained in a simulated environment under thousands of domain randomizations to achieve robust performance over diverse uncertainties. The performance of the proposed method was verified through real-world experiments and then compared with a conventional controller and a state-of-the-art learning-based controller trained with a vanilla deep RL algorithm.
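The abstract does not specify the form of the uncertainty-aware control mixer. As a minimal sketch of one plausible realization (the function names, the exponential weighting, and the cap `alpha_max` are all assumptions, not the paper's method), the mixer can blend the nominal and learned commands with a weight that shrinks toward the stability-guaranteed nominal controller as the estimated uncertainty grows:

```python
import numpy as np

def mix_control(u_nominal, u_policy, uncertainty, alpha_max=0.8):
    """Uncertainty-weighted blend of a nominal controller and a learned policy.

    Hypothetical sketch: the policy's share is capped at alpha_max and
    decays exponentially with the uncertainty estimate, so the nominal
    controller dominates whenever the policy is least trustworthy.
    """
    alpha = alpha_max * np.exp(-uncertainty)  # policy weight in (0, alpha_max]
    u_nominal = np.asarray(u_nominal, dtype=float)
    u_policy = np.asarray(u_policy, dtype=float)
    return (1.0 - alpha) * u_nominal + alpha * u_policy

# With zero estimated uncertainty the learned policy contributes most;
# with large uncertainty the command collapses to the nominal controller.
u_low = mix_control([1.0, 0.0], [0.0, 1.0], uncertainty=0.0)
u_high = mix_control([1.0, 0.0], [0.0, 1.0], uncertainty=10.0)
```

A convex combination like this keeps the mixed command inside the segment between the two inputs, which is one simple way to bound how far the learned policy can pull the system away from the nominal, stability-guaranteed behavior.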