MetaDrive: Composing Diverse Driving Scenarios for Generalizable Reinforcement Learning

Source: arXiv. Source code, documentation, and a demo video are available at https://metadriverse.github.io/metadrive . More research projects based on the MetaDrive simulator are listed at https://metadriverse.github.io

Driving safely requires multiple capabilities from human and intelligent agents, such as the generalizability to unseen environments, the safety awareness of the surrounding traffic, and the decision-making in complex multi-agent settings. Despite the great success of Reinforcement Learning (RL), most of the RL research works investigate each capability separately due to the lack of integrated environments. In this work, we develop a new driving simulation platform called MetaDrive to support the research of generalizable reinforcement learning algorithms for machine autonomy. MetaDrive is highly compositional, which can generate an infinite number of diverse driving scenarios from both the procedural generation and the real data importing. Based on MetaDrive, we construct a variety of RL tasks and baselines in both single-agent and multi-agent settings, including benchmarking generalizability across unseen scenes, safe exploration, and learning multi-agent traffic. The generalization experiments conducted on both procedurally generated scenarios and real-world scenarios show that increasing the diversity and the size of the training set leads to the improvement of the generalizability of the RL agents. We further evaluate various safe reinforcement learning and multi-agent reinforcement learning algorithms in MetaDrive environments and provide the benchmarks.
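The abstract's two central ideas, composing scenarios procedurally and evaluating on scenes never seen in training, can be illustrated with a small self-contained sketch. This is a toy model only: it does not use the MetaDrive API, and the block vocabulary, the function names (`generate_scenario`, `make_split`, `layout_coverage`), and the seed ranges are all assumptions made for illustration. It also does not train an RL agent; it only shows how a seed can deterministically define a scenario and how a held-out seed range keeps test scenes separate from training scenes.

```python
import random

# Illustrative block vocabulary; not taken from the paper or the MetaDrive API.
BLOCK_TYPES = ("straight", "curve", "ramp", "roundabout", "intersection")


def generate_scenario(seed, num_blocks=3):
    """Compose one road layout as a sequence of blocks, determined by the seed."""
    rng = random.Random(seed)  # local RNG: the same seed always gives the same layout
    return tuple(rng.choice(BLOCK_TYPES) for _ in range(num_blocks))


def make_split(num_train, num_test, test_start_seed=10_000):
    """Return disjoint training and held-out test scenario sets, keyed by seed."""
    if num_train > test_start_seed:
        raise ValueError("training seeds would overlap the held-out test seeds")
    train = {seed: generate_scenario(seed) for seed in range(num_train)}
    test = {
        seed: generate_scenario(seed)
        for seed in range(test_start_seed, test_start_seed + num_test)
    }
    return train, test


def layout_coverage(train, test):
    """Fraction of test layouts whose exact block sequence also occurs in training."""
    seen = set(train.values())
    return sum(layout in seen for layout in test.values()) / len(test)


if __name__ == "__main__":
    # Larger training sets contain more distinct layouts and cover more test layouts.
    for n in (1, 10, 100, 1000):
        train, test = make_split(n, 200)
        print(n, len(set(train.values())), round(layout_coverage(train, test), 2))
```

Layout coverage here is only a crude proxy for training-set diversity. The paper's claim concerns the driving performance of trained agents on unseen scenarios, which this sketch does not measure.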
