We present PantheonRL, a multiagent reinforcement learning software package for dynamic training interactions such as round-robin, adaptive, and ad-hoc training. Our package is designed around flexible agent objects that can be easily configured to support different training interactions, and handles fully general multiagent environments with mixed rewards and n agents. Built on top of StableBaselines3, our package works directly with existing powerful deep RL algorithms. Finally, PantheonRL comes with an intuitive yet functional web user interface for configuring experiments and launching multiple asynchronous jobs. Our package can be found at https://github.com/Stanford-ILIAD/PantheonRL.
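As an illustration of the round-robin training interaction mentioned above, the following minimal sketch assigns a partner to each training episode by cycling through a fixed partner pool. It does not use PantheonRL's API: the function `round_robin_schedule` and the partner names are hypothetical and chosen only for this example.

```python
from itertools import cycle
from typing import List, Tuple


def round_robin_schedule(partners: List[str], episodes: int) -> List[Tuple[int, str]]:
    """Pair each episode index with a partner, cycling through the pool in order.

    Hypothetical helper for illustration; not part of PantheonRL.
    """
    if not partners:
        raise ValueError("at least one partner is required")
    partner_cycle = cycle(partners)
    return [(episode, next(partner_cycle)) for episode in range(episodes)]


# Hypothetical partner pool; in practice each entry would be a trained agent.
schedule = round_robin_schedule(["partner_a", "partner_b", "partner_c"], episodes=5)
for episode, partner in schedule:
    print(f"episode {episode}: ego agent trains with {partner}")
```

An adaptive or ad-hoc scheme would replace the fixed cycle with a rule that picks the partner from training feedback or at random.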