Reducing travel time alone is insufficient to support the development of future smart transportation systems. To align with the United Nations Sustainable Development Goals (UN-SDG), further reductions in fuel consumption and emissions, improvements in traffic safety, and ease of infrastructure deployment and maintenance should also be considered. Unlike existing work that optimizes the control of either traffic light signals (to improve intersection throughput) or vehicle speed (to stabilize traffic), this paper presents a multi-agent deep reinforcement learning (DRL) system called CoTV, which Cooperatively controls both Traffic light signals and connected autonomous Vehicles (CAV). CoTV can therefore balance the reduction of travel time, fuel consumption, and emissions. At the same time, CoTV is easy to deploy, since each traffic light controller cooperates with only the one CAV nearest to it on each incoming road. This enables more efficient coordination between traffic light controllers and CAVs, which in turn allows the training of CoTV to converge in large-scale multi-agent scenarios where convergence is traditionally difficult to achieve. We present the detailed system design of CoTV and demonstrate its effectiveness in a simulation study using SUMO on various grid maps and realistic urban scenarios with mixed-autonomy traffic.
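To make the "one nearest CAV per incoming road" coordination concrete, the following is a minimal sketch of how such a selection could be done in SUMO via the TraCI API. It is an illustrative assumption, not the CoTV implementation: the function name select_nearest_cavs and the vehicle type id "cav" are hypothetical, and it assumes an active TraCI connection to a running SUMO simulation.

```python
# Hypothetical sketch: for each incoming road of a traffic light, pick the CAV
# closest to the stop line. Names (select_nearest_cavs, CAV_TYPE) are assumed,
# not taken from the CoTV source code.
import traci

CAV_TYPE = "cav"  # assumed vehicle type id marking connected autonomous vehicles


def select_nearest_cavs(tls_id):
    """Return {edge_id: vehicle_id} of the CAV nearest to the intersection
    on each incoming road controlled by traffic light `tls_id`."""
    nearest = {}  # edge_id -> (vehicle_id, distance to stop line)
    for lane_id in set(traci.trafficlight.getControlledLanes(tls_id)):
        edge_id = traci.lane.getEdgeID(lane_id)
        lane_len = traci.lane.getLength(lane_id)
        for veh_id in traci.lane.getLastStepVehicleIDs(lane_id):
            if traci.vehicle.getTypeID(veh_id) != CAV_TYPE:
                continue  # skip human-driven vehicles in mixed-autonomy traffic
            # Remaining distance from the vehicle to the end of the lane,
            # i.e. roughly its distance to the intersection stop line.
            dist = lane_len - traci.vehicle.getLanePosition(veh_id)
            if edge_id not in nearest or dist < nearest[edge_id][1]:
                nearest[edge_id] = (veh_id, dist)
    return {edge: veh for edge, (veh, _) in nearest.items()}
```

Under this kind of scheme, each traffic light agent would exchange messages with at most one CAV per incoming road at any time, which is what keeps the number of interacting agents, and hence the training problem, tractable at scale.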