Collision avoidance algorithms are of central interest to many drone applications. In particular, decentralized approaches may be the key to enabling robust drone swarm solutions in cases where centralized communication becomes computationally prohibitive. In this work, we draw biological inspiration from flocks of starlings (Sturnus vulgaris) and apply the insight to end-to-end learned decentralized collision avoidance. More specifically, we propose a new, scalable observation model following a biomimetic nearest-neighbor information constraint that leads to fast learning and good collision avoidance behavior. By building on a general reinforcement learning formulation, we obtain an end-to-end learning-based approach to integrating collision avoidance with arbitrary tasks such as package collection and formation change. To validate the generality of this approach, we successfully apply our methodology to motion models of medium complexity that model momentum yet still allow direct application to real-world quadrotors in conjunction with a standard PID controller. In contrast to prior works, we find that in our sufficiently rich motion model, nearest-neighbor information is indeed enough to learn effective collision avoidance behavior. Our learned policies are tested in simulation and subsequently transferred to real-world drones to validate their practical applicability.
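To make the nearest-neighbor information constraint concrete, the following is a minimal sketch of what such an observation could look like. The abstract does not specify the observation contents, so everything here is an assumption made for illustration: the function name `nearest_neighbor_observation`, the 2D state, the use of relative positions and velocities, and the zero-padding. The point it shows is that the observation size depends only on the number of neighbors `k`, not on the swarm size, which is what makes such a model scalable.

```python
import math
from typing import List, Sequence, Tuple

Vec = Tuple[float, float]  # assumed 2D state, for illustration only


def nearest_neighbor_observation(
    index: int,
    positions: Sequence[Vec],
    velocities: Sequence[Vec],
    k: int = 1,
) -> List[float]:
    """Hypothetical observation: own velocity plus the relative position
    and relative velocity of the k nearest other agents."""
    px, py = positions[index]
    vx, vy = velocities[index]

    # Sort all other agents by Euclidean distance to this agent.
    others = sorted(
        (j for j in range(len(positions)) if j != index),
        key=lambda j: math.hypot(positions[j][0] - px, positions[j][1] - py),
    )
    nearest = others[:k]

    obs = [vx, vy]
    for j in nearest:
        obs += [
            positions[j][0] - px,
            positions[j][1] - py,
            velocities[j][0] - vx,
            velocities[j][1] - vy,
        ]

    # Assumption: zero-pad when fewer than k neighbors exist,
    # so the observation length is always 2 + 4 * k.
    obs += [0.0] * (4 * (k - len(nearest)))
    return obs


positions = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (0.0, 2.0)]
velocities = [(1.0, 0.0), (0.0, 1.0), (0.0, 0.0), (-1.0, 0.0)]
obs = nearest_neighbor_observation(0, positions, velocities, k=2)
print(len(obs))
```

Adding more agents far away from agent 0 would leave `obs` unchanged, so in this sketch a policy network consuming it could keep a fixed input size as the swarm grows.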