Recently, neural control policies have outperformed existing model-based planning-and-control methods for autonomously navigating quadrotors through cluttered environments in minimum time. However, they are not perception-aware, a crucial requirement in vision-based navigation given the camera's limited field of view and the underactuated nature of a quadrotor. We propose a learning-based system that achieves perception-aware, agile flight in cluttered environments. Our method combines imitation learning with reinforcement learning (RL) by leveraging a privileged learning-by-cheating framework. Using RL, we first train a perception-aware teacher policy with access to full-state information to fly in minimum time through cluttered environments. Then, we use imitation learning to distill its knowledge into a vision-based student policy that perceives the environment only via a camera. Our approach tightly couples perception and control, showing a significant advantage in computation speed (10 times faster) and success rate. We demonstrate closed-loop control performance using hardware-in-the-loop simulation.
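The teacher-student distillation described above can be illustrated with a minimal, hypothetical sketch: a privileged "teacher" acts on the full state, while a "student" sees only a partial observation and is fit by behavior cloning. Everything here (linear policies, least-squares regression, the dimensions) is a stand-in assumption for illustration, not the paper's actual RL teacher or vision-based student.

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM = 6  # hypothetical full state (e.g. position + velocity)
OBS_DIM = 3    # the student's partial, camera-like observation

def teacher_policy(state):
    # Privileged teacher: acts on the full state.
    # A fixed linear controller stands in for the trained RL policy.
    W_teacher = np.eye(2, STATE_DIM)
    return W_teacher @ state

def observe(state):
    # Student only sees the first OBS_DIM entries of the state.
    return state[:OBS_DIM]

# Distillation dataset: (partial observation, teacher action) pairs --
# the "learning-by-cheating" step, sketched as plain behavior cloning.
states = rng.normal(size=(500, STATE_DIM))
obs = np.array([observe(s) for s in states])
actions = np.array([teacher_policy(s) for s in states])

# Fit a linear student by least squares: student(o) = o @ W_student.
W_student, *_ = np.linalg.lstsq(obs, actions, rcond=None)

def student_policy(o):
    return o @ W_student
```

In this toy setup the teacher's action depends only on observable state components, so the student recovers it exactly; in the real system the student must instead infer the relevant information from camera images, which is why the distillation (rather than a direct fit) is needed.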