As the autonomous driving industry grows, so does the potential for interaction among groups of autonomous cars. With advances in artificial intelligence and simulation, such groups can be simulated, and safety-critical models that control the cars within them can be learned. This study applies reinforcement learning to the problem of multi-agent car parking, where groups of cars aim to park themselves efficiently while remaining safe and rational. Utilising robust tools and machine learning frameworks, we design and implement a flexible car parking environment in the form of a Markov decision process with independent learners, exploiting multi-agent communication. We implement a suite of tools to perform experiments at scale, obtaining models that park up to 7 cars with a success rate of over 98.1%, significantly outperforming existing single-agent models. We also obtain several results on the competitive and collaborative behaviours exhibited by the cars in our environment under varying densities and levels of communication. Notably, we discover a form of collaboration that cannot arise without competition, and a 'leaky' form of collaboration whereby agents collaborate without sufficient state. Such work has numerous potential applications in the autonomous driving and fleet management industries, and provides several useful techniques and benchmarks for the application of reinforcement learning to multi-agent car parking.