Cooperative autonomous robotic systems have significant potential for executing complex multi-task missions across space, air, ground, and maritime domains. However, they commonly operate in remote, dynamic, and hazardous environments that require rapid in-mission adaptation without reliance on fragile or slow communication links to centralised compute. Fast, on-board replanning algorithms are therefore needed to enhance resilience. Reinforcement Learning shows strong promise for efficiently solving mission planning tasks when they are formulated as Travelling Salesperson Problems (TSPs), but existing methods: 1) are unsuitable for replanning, where agents do not start at a single location; 2) do not allow cooperation between agents; 3) are unable to model tasks with variable durations; or 4) lack practical considerations for on-board deployment. Here we define the Cooperative Mission Replanning Problem as a novel variant of the multiple TSP with adaptations to overcome these issues, and develop a new encoder/decoder-based model using Graph Attention Networks and Attention Models to solve it effectively and efficiently. Using a simple example of cooperative drones, we show that our replanner consistently (90% of the time) maintains performance within 10% of the state-of-the-art LKH3 heuristic solver, whilst running 85-370 times faster on a Raspberry Pi. This work paves the way for increased resilience in autonomous multi-agent systems.