We propose a model-free reinforcement learning method for controlling mixed autonomy traffic in simulated traffic networks with through-traffic-only two-way and four-way intersections. Our method utilizes multi-agent policy decomposition, which allows decentralized control based on local observations for an arbitrary number of controlled vehicles. We demonstrate that, even without reward shaping, reinforcement learning learns to coordinate the vehicles to exhibit traffic-signal-like behaviors, achieving near-optimal throughput with 33-50% controlled vehicles. With the help of multi-task learning and transfer learning, we show that this behavior generalizes across inflow rates and the size of the traffic network. Our code, models, and videos of results are available at https://github.com/ZhongxiaYan/mixed_autonomy_intersections.
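To illustrate the multi-agent policy decomposition mentioned above, the sketch below shows one way a single shared policy can be applied independently to each controlled vehicle's local observation, which is why the controller extends to an arbitrary number of vehicles. This is a minimal PyTorch sketch under assumed observation and action dimensions, not the authors' implementation; the class name `SharedVehiclePolicy` and all sizes are hypothetical.

```python
# Minimal sketch (not the paper's code) of multi-agent policy decomposition:
# one shared network is applied per controlled vehicle, so the joint policy
# factorizes over vehicles and scales to any number of controlled vehicles.
import torch
import torch.nn as nn

class SharedVehiclePolicy(nn.Module):
    def __init__(self, obs_dim: int = 10, act_dim: int = 1, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, act_dim),  # e.g. an acceleration command per vehicle
        )

    def forward(self, local_obs: torch.Tensor) -> torch.Tensor:
        # local_obs: (num_controlled_vehicles, obs_dim); the same parameters
        # are reused for every vehicle, decomposing the joint policy.
        return self.net(local_obs)

policy = SharedVehiclePolicy()
obs = torch.randn(7, 10)      # 7 controlled vehicles, each with a local observation
actions = policy(obs)         # one action per vehicle; any vehicle count works
```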