Catching high-speed targets in flight is a complex and typical highly dynamic task. In this paper, we propose Catch Planner, a planning-with-decision scheme for catching. For sequential decision making, we propose a policy search method based on deep reinforcement learning. To make catching adaptive and flexible, we propose a trajectory optimization method that jointly optimizes the highly coupled catching time and terminal state while accounting for dynamic feasibility and safety. We also propose a flexible constraint transcription method that allows targets to be caught at any reasonable attitude and terminal position bias. The proposed Catch Planner offers a new paradigm for combining learning and planning. It is integrated on a quadrotor of our own design and runs at 100 Hz on the onboard computer. Extensive experiments in real and simulated scenes verify the robustness of the proposed method and its extensibility to a variety of high-speed flying targets.