Autonomous drones can operate in remote and unstructured environments, enabling various real-world applications. However, the lack of effective vision-based algorithms has been a stumbling block to achieving this goal. Existing systems often require hand-engineered components for state estimation, planning, and control. Such a sequential design involves laborious tuning, human heuristics, and compounding delays and errors. This paper tackles vision-based autonomous drone racing by learning deep sensorimotor policies. We use contrastive learning to extract robust feature representations from the input images and leverage a two-stage learning-by-cheating framework to train a neural network policy. The resulting policy directly infers control commands from feature representations learned from raw images, forgoing the need for globally consistent state estimation, trajectory planning, and handcrafted control design. Our experimental results indicate that the vision-based policy achieves the same level of racing performance as the state-based policy while remaining robust to different visual disturbances and distractors. We believe this work serves as a stepping stone toward intelligent vision-based autonomous systems that control the drone purely from image inputs, like human pilots.
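To make the two training ingredients named above concrete, the following is a minimal PyTorch sketch of (1) contrastive, InfoNCE-style pretraining of an image encoder and (2) the second learning-by-cheating stage, in which a vision-based student regresses the commands of a privileged state-based teacher. All module sizes, the action dimension, and the loss choices here are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch, assuming illustrative network sizes and losses.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ImageEncoder(nn.Module):
    """Maps raw images to a compact feature vector (hypothetical CNN)."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.proj = nn.Linear(64, feat_dim)

    def forward(self, x):
        return self.proj(self.conv(x))

def info_nce_loss(z1, z2, temperature=0.1):
    """Contrastive loss between two augmented views of the same batch.

    Positive pairs are (z1[i], z2[i]); every other pairing in the batch
    acts as a negative. Minimizing this pulls features of the same scene
    together and pushes features of different scenes apart.
    """
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature   # (B, B) similarity matrix
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)

class StudentPolicy(nn.Module):
    """Vision-based student: image features -> control command."""
    def __init__(self, feat_dim=128, act_dim=4):  # e.g., thrust + body rates
        super().__init__()
        self.encoder = ImageEncoder(feat_dim)
        self.head = nn.Sequential(
            nn.Linear(feat_dim, 128), nn.ReLU(), nn.Linear(128, act_dim)
        )

    def forward(self, image):
        return self.head(self.encoder(image))

def distillation_step(student, teacher_actions, images, optimizer):
    """One learning-by-cheating step: the teacher was trained with
    privileged ground-truth state; the student imitates its commands
    from images alone."""
    optimizer.zero_grad()
    loss = F.mse_loss(student(images), teacher_actions)
    loss.backward()
    optimizer.step()
    return loss.item()
```

The point of this two-stage design is that the privileged teacher can be trained with full ground-truth state (e.g., in simulation), while the deployed student never requires globally consistent state estimation: it maps learned image features directly to control commands.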