The development of guidance, navigation, and control frameworks and algorithms for swarms has attracted significant attention in recent years. Nevertheless, planning swarm allocations and trajectories for engagement with enemy swarms remains a largely understudied problem. Although small-scale scenarios can be addressed with tools from differential game theory, existing approaches fail to scale to large multi-agent pursuit-evasion (PE) scenarios. In this work, we propose a reinforcement learning (RL) based framework that decomposes large-scale swarm engagement problems into a number of independent multi-agent pursuit-evasion games. We simulate a variety of multi-agent PE scenarios in which finite-time capture is guaranteed under certain conditions. The resulting PE statistics are provided as a reward signal to a high-level allocation layer, which uses an RL algorithm to allocate controlled swarm units so as to eliminate enemy swarm units with maximum efficiency. We verify our approach in large-scale swarm-to-swarm engagement simulations.