Recent advances in Deep Reinforcement Learning (DRL) have contributed to robotics by enabling automatic controller design. Automatic controller design is a crucial approach for swarm robotic systems, which require more complex controllers than a single-robot system to produce a desired collective behaviour. Although DRL-based controller design methods have shown their effectiveness, their reliance on a central training server is a critical problem in real-world environments where robot-server communication is unstable or limited. We propose a novel Federated Learning (FL) based DRL training strategy (FLDDPG) for swarm robotic applications. A comparison with baseline strategies under a limited communication bandwidth scenario shows that FLDDPG achieves higher robustness and better generalisation to a different environment and to real robots, while the baseline strategies suffer from the bandwidth limitation. This result suggests that the proposed method can benefit swarm robotic systems operating in environments with limited communication bandwidth, e.g., high-radiation, underwater, or subterranean environments.
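Although the abstract does not describe the training loop, a natural reading of an FL-based DRL strategy is that each robot trains its own DDPG networks locally and only periodically synchronises model weights through federated averaging, rather than streaming raw experience to a central server. The sketch below illustrates such an averaging step under that assumption; the function name `fed_avg`, the layer keys, and the use of NumPy are illustrative and not the authors' implementation.

```python
# Minimal sketch of a federated-averaging step as assumed above:
# each robot keeps local DDPG actor/critic copies and only exchanges
# weights at sync intervals, bounding the required communication bandwidth.
import numpy as np

def fed_avg(local_weights):
    """Average a list of per-robot weight dicts {layer_name: np.ndarray}."""
    averaged = {}
    for name in local_weights[0]:
        averaged[name] = np.mean([w[name] for w in local_weights], axis=0)
    return averaged

# Example: three robots, each holding one (hypothetical) actor layer.
robots = [{"actor/fc1": np.random.randn(4, 2)} for _ in range(3)]
global_weights = fed_avg(robots)          # broadcast back to every robot
print(global_weights["actor/fc1"].shape)  # (4, 2)
```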