Traversing a tilted narrow gap has previously been an intractable task for reinforcement learning, mainly because of two challenges. First, searching for feasible trajectories is nontrivial because the goal behind the gap is difficult to reach. Second, the error tolerance after Sim2Real transfer is low because the flight speed is high relative to the gap's narrow dimensions. This problem is aggravated by the difficulty of collecting real-world data, given the risk of collision damage. In this paper, we propose an end-to-end reinforcement learning framework that successfully solves this task by addressing both problems. To search for dynamically feasible flight trajectories, we use curriculum learning to guide the agent towards the sparse reward behind the obstacle. To tackle the Sim2Real problem, we propose a Sim2Real framework that can transfer control commands to a real quadrotor without using real flight data. To the best of our knowledge, this paper is the first work that accomplishes the gap-traversing task purely using deep reinforcement learning.
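
The abstract does not specify how the curriculum is scheduled. As a purely illustrative sketch of the general idea, one common way to guide an agent towards a sparse reward is to start with an easy goal and move it further away once the agent succeeds often enough. The class name `GoalCurriculum`, the offsets, the step size, and the promotion threshold below are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class GoalCurriculum:
    """Hypothetical curriculum that moves the goal from the gap plane to a point behind it."""
    start_offset_m: float = 0.0   # assumed: goal starts at the gap centre
    final_offset_m: float = 1.0   # assumed: final goal distance behind the gap
    step_m: float = 0.25          # assumed: distance the goal moves per promotion
    promote_at: float = 0.8       # assumed: success rate required to advance

    def __post_init__(self):
        # Current goal offset, updated as training progresses.
        self.offset_m = self.start_offset_m

    def update(self, success_rate: float) -> float:
        """Advance the goal if the recent success rate is high enough; return the current offset."""
        if success_rate >= self.promote_at:
            self.offset_m = min(self.offset_m + self.step_m, self.final_offset_m)
        return self.offset_m


# Example: feed in success rates measured over successive evaluation windows.
curriculum = GoalCurriculum()
for rate in [0.5, 0.85, 0.9, 0.6, 0.95, 0.99, 1.0]:
    offset = curriculum.update(rate)
    print(f"success rate {rate:.2f} -> goal offset {offset:.2f} m")
```

In a training loop, the returned offset would be used to place the goal when resetting the simulated environment, so the reward becomes progressively sparser only as the policy improves.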