Many algorithms for control of multi-robot teams operate under the assumption that low-latency, global state information necessary to coordinate agent actions can readily be disseminated among the team. However, in harsh environments with no existing communication infrastructure, robots must form ad-hoc networks, forcing the team to operate in a distributed fashion. To overcome this challenge, we propose a task-agnostic, decentralized, low-latency method for data distribution in ad-hoc networks using Graph Neural Networks (GNNs). Our approach enables multi-agent algorithms based on global state information to function by ensuring it is available at each robot. To do this, agents glean information about the topology of the network from packet transmissions and feed it to a GNN running locally, which instructs the agent when and where to transmit the latest state information. We train the distributed GNN communication policies via reinforcement learning using the average Age of Information as the reward function, and show that this improves training stability compared to task-specific reward functions. Our approach performs favorably compared to industry-standard methods for data distribution such as random flooding and round robin. We also show that the trained policies generalize to larger teams of both static and mobile agents.
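To make the reward concrete, the average Age of Information can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes discrete timesteps and a hypothetical `age` matrix where `age[i][j]` is how stale agent `i`'s copy of agent `j`'s state is.

```python
# Minimal sketch of the average Age of Information (AoI) metric,
# assuming discrete timesteps. age[i][j] counts the timesteps since
# agent i last received a fresh copy of agent j's state (hypothetical
# representation; the paper's exact bookkeeping may differ).

def step_ages(age, deliveries):
    """Advance every age by one timestep, then reset entries for the
    state updates delivered this step. `deliveries` is a set of
    (receiver, source) pairs."""
    n = len(age)
    for i in range(n):
        for j in range(n):
            if i != j:
                age[i][j] += 1
    for (i, j) in deliveries:
        age[i][j] = 0  # a fresh copy of j's state arrived at agent i
    return age

def average_aoi(age):
    """Mean staleness over all ordered agent pairs. An RL reward would
    use its negative, since lower average age is better."""
    n = len(age)
    total = sum(age[i][j] for i in range(n) for j in range(n) if i != j)
    return total / (n * (n - 1))
```

Because this reward depends only on how fresh each agent's view of the team state is, not on any downstream task, it matches the abstract's task-agnostic framing: the same communication policy can serve any algorithm that consumes the distributed state.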