Autonomous vehicles are well suited to continuous area patrolling problems. However, finding an optimal patrolling strategy can be challenging for several reasons. First, patrolling environments are often complex and can include unknown environmental factors. Second, autonomous vehicles may suffer failures or hardware constraints, such as limited battery life. Importantly, patrolling large areas often requires multiple agents that must collectively coordinate their actions. In this work, we address these limitations and propose an approach based on model-free, deep multi-agent reinforcement learning. In this approach, the agents are trained to recharge themselves automatically when required, supporting continuous collective patrolling. A distributed, homogeneous multi-agent architecture is proposed, in which all patrolling agents execute identical policies locally, based on their own observations and shared information. This architecture provides a fault-tolerant and robust patrolling system that can tolerate agent failures and allows supplementary agents to be added to replace failed agents or to increase overall patrol performance. The solution is validated through simulation experiments from multiple perspectives, including overall patrol performance, the efficiency of battery-recharging strategies, and overall fault tolerance and robustness.