Over the past few years, the use of swarms of Unmanned Aerial Vehicles (UAVs) in monitoring and remote area surveillance applications has become widespread, thanks to the falling prices and increasing capabilities of drones. The drones in the swarm need to cooperatively explore an unknown area in order to identify and monitor targets of interest, while minimizing their movements. In this work, we propose a distributed Reinforcement Learning (RL) approach that scales to larger swarms without modification. The proposed framework relies on the UAVs' ability to exchange some information through a communication channel, in order to achieve context-awareness and implicitly coordinate the swarm's actions. Our experiments show that the proposed method can yield effective strategies that are robust to communication channel impairments and can easily handle non-uniform distributions of targets and obstacles. Moreover, agents trained in a specific scenario can adapt to a new one with minimal additional training. We also show that our approach achieves better performance than a computationally intensive look-ahead heuristic.