Emerging vehicular systems with increasing proportions of automated components present opportunities for optimal control to mitigate congestion and increase efficiency. There has been recent interest in applying deep reinforcement learning (DRL) to these nonlinear dynamical systems for the automatic design of effective control strategies. Although DRL has the conceptual advantage of being model-free, studies nonetheless typically rely on training setups that are painstakingly specialized to specific vehicular systems. This is a key challenge to efficient analysis of diverse vehicular and mobility systems. To this end, this article contributes a streamlined methodology for vehicular microsimulation and discovers high-performance control strategies with minimal manual design. A variable-agent, multi-task approach is presented for optimization of vehicular Partially Observed Markov Decision Processes. The methodology is experimentally validated on mixed autonomy traffic systems, in which a fraction of the vehicles are automated; empirical improvement, typically 15-60% over a human driving baseline, is observed in all configurations of six diverse open or closed traffic systems. The study reveals numerous emergent behaviors resembling wave mitigation, traffic signaling, and ramp metering. Finally, the emergent behaviors are analyzed to produce interpretable control strategies, which are validated against the learned control strategies.