This paper addresses the problem of flying a quadrotor through unknown, tilted, narrow gaps using Deep Reinforcement Learning (DRL). Previous learning-based methods relied on accurate prior knowledge of the environment, including the gap's pose and size. In contrast, we integrate onboard sensing and detect the gap from a single onboard camera. The training problem is challenging for two reasons: variably tilted, narrow gaps demand a precise and robust whole-body planning and control policy, and real-world experiments require an effective Sim2Real method. To this end, we propose a learning framework for agile gap-traversal flight that trains the vehicle to pass through the center of the gap with its attitude approximately aligned with the gap's, even at aggressive tilt angles. The policy, trained only in simulation, can be transferred to different domains with fine-tuning while maintaining its success rate. The proposed framework, which integrates onboard sensing with a neural network controller, achieves a success rate of 84.51% in real-world experiments with gap orientations of up to 60°. To the best of our knowledge, this is the first work to demonstrate learning-based traversal of variably tilted narrow gaps in the real world without prior knowledge of the environment.