Multi-agent reinforcement learning (MARL) has emerged as a promising approach for learning complex and scalable coordination behaviors in multi-robot systems. However, established MARL platforms (e.g., SMAC and MPE) lack robotics relevance and support for hardware deployment, forcing multi-robot learning researchers to build bespoke environments and hardware testbeds to develop and evaluate their individual contributions. The Multi-Agent RL Benchmark and Learning Environment for the Robotarium (MARBLER) is a recent step toward a standardized, robotics-relevant platform for MARL, bridging the Robotarium testbed with existing MARL software infrastructure. However, MARBLER lacks support for parallelization and GPU/TPU execution, which makes the platform prohibitively slow compared to modern MARL environments and hinders adoption. We contribute JaxRobotarium, a JAX-powered end-to-end simulation, learning, deployment, and benchmarking platform for the Robotarium. JaxRobotarium enables rapid training and deployment of multi-robot RL (MRRL) policies with realistic robot dynamics and safety constraints, and it supports parallelization and hardware acceleration. Its generalizable learning interface integrates easily with SOTA MARL libraries (e.g., JaxMARL). In addition, JaxRobotarium provides eight standardized coordination scenarios, four of which are novel scenarios that bring established MARL benchmark tasks (e.g., RWARE and Level-Based Foraging) to a robotics setting. We demonstrate that JaxRobotarium retains high simulation fidelity while achieving dramatic speedups over the baseline (20x in training and 150x in simulation), and that it provides an open-access sim-to-real evaluation pipeline through the Robotarium testbed, accelerating and democratizing access to multi-robot learning research and evaluation. Our code is available at https://github.com/GT-STAR-Lab/JaxRobotarium.