Target search and detection encompasses a variety of decision problems, including coverage, surveillance, search, observation, and pursuit-evasion, among others. In this paper we develop a multi-agent deep reinforcement learning (MADRL) method to coordinate a group of aerial vehicles (drones) in locating a set of static targets in an unknown area. To that end, we have designed a realistic drone simulator that replicates the dynamics and perturbations of a real experiment and whose model incorporates statistical inferences drawn from experimental data. Our reinforcement learning method, trained in this simulator, was able to find near-optimal policies for the drones. In contrast to other state-of-the-art MADRL methods, our method is fully decentralized during both learning and execution, can handle high-dimensional and continuous observation spaces, and does not require tuning of additional hyperparameters.