Decentralized Cooperative Lane Changing at Freeway Weaving Areas Using Multi-Agent Deep Reinforcement Learning

Frequent lane changes during congestion at freeway bottlenecks such as merge and weaving areas further reduce roadway capacity. The emergence of deep reinforcement learning (RL) and connected and automated vehicle technology provides a possible way to improve mobility and energy efficiency at freeway bottlenecks through cooperative lane changing. Deep RL is a collection of machine-learning methods that enable an agent to improve its performance by learning from the environment. In this study, a decentralized cooperative lane-changing controller was developed using proximal policy optimization within a multi-agent deep RL paradigm. In the decentralized control strategy, policy learning and action reward are evaluated locally, with each agent (vehicle) having access to global state information. Multi-agent deep RL requires fewer computational resources and is more scalable than single-agent deep RL, making it a powerful tool for time-sensitive applications such as cooperative lane changing. The results of this study show that cooperative lane changing enabled by multi-agent deep RL outperforms human drivers in terms of traffic throughput, vehicle speed, number of stops per vehicle, vehicle fuel efficiency, and emissions. The trained RL policy is transferable and generalizes to uncongested, moderately congested, and extremely congested traffic conditions.
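To make the idea of locally evaluated policy learning concrete, the following is a minimal sketch of the standard clipped surrogate objective from proximal policy optimization, computed separately for each vehicle. It is an illustration under stated assumptions, not the controller developed in the study: the function name `ppo_clipped_objective`, the clipping value of 0.2, the vehicle identifiers, and the sample numbers are all hypothetical, and the abstract does not specify the network architecture, reward, or hyperparameters.

```python
import math

def ppo_clipped_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch (to be maximized).

    clip_eps=0.2 is a commonly used default, assumed here for illustration.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the updated and the data-collecting policy.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the clipped and unclipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)

# Hypothetical per-vehicle batches: each agent evaluates its objective locally.
agent_batches = {
    "veh_0": {"new_logp": [math.log(0.6)], "old_logp": [math.log(0.4)], "advantages": [1.0]},
    "veh_1": {"new_logp": [math.log(0.2)], "old_logp": [math.log(0.4)], "advantages": [-1.0]},
}
local_objectives = {vid: ppo_clipped_objective(**batch) for vid, batch in agent_batches.items()}
```

In this toy data the ratio for `veh_0` is 1.5 with a positive advantage, so clipping caps its contribution at 1.2; for `veh_1` the ratio is 0.5 with a negative advantage, so the clipped term (-0.8) is the one retained. Each objective depends only on that vehicle's own samples, which is one way the local evaluation described in the abstract could be realized.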
