A platoon is a group of vehicles traveling together in very close proximity. Platooning has received significant attention from the autonomous vehicle research community because of its potential to improve fuel efficiency, driving safety, and driver comfort. Despite these advantages, recent research has shown that the extremely small intra-platoon gap has a detrimental effect on traffic flow at highway on-ramp merges. Existing control-based methods can adapt the intra-platoon gap to improve traffic flow, but making an optimal control decision under complex traffic dynamics remains a significant challenge because of the high computational complexity. To this end, we present the design, implementation, and evaluation of a novel reinforcement learning framework that adaptively adjusts the intra-platoon gap of an individual platoon member to maximize traffic flow in response to dynamically changing, complex traffic conditions at highway on-ramp merges. The state space of the framework is designed with reference to the transportation literature so that it incorporates the critical traffic parameters relevant to merging efficiency. A deep deterministic policy gradient (DDPG) algorithm is adopted to handle the continuous action space, allowing precise and continuous adjustment of the intra-platoon gap. An extensive simulation study demonstrates that the reinforcement learning-based approach significantly improves traffic flow in various highway merging scenarios.
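To make the continuous action space concrete, the following is a minimal sketch of how a deterministic actor output could be mapped to an intra-platoon gap, together with the exploration noise and target-network soft update that DDPG commonly uses. It is an illustration under stated assumptions, not the framework evaluated in this work: the gap bounds, the three state features, the linear actor, and all function names are hypothetical.

```python
import math
import random

# Hypothetical bounds on the intra-platoon gap in metres (assumed, not from the paper).
GAP_MIN_M = 2.0
GAP_MAX_M = 30.0


def actor(state, weights, bias):
    """Deterministic policy: a linear layer squashed by tanh into [-1, 1].

    A real DDPG actor would be a neural network; a linear map keeps the sketch short.
    """
    pre_activation = sum(w * s for w, s in zip(weights, state)) + bias
    return math.tanh(pre_activation)


def to_gap(action, gap_min=GAP_MIN_M, gap_max=GAP_MAX_M):
    """Affinely map an action in [-1, 1] onto the gap range [gap_min, gap_max]."""
    return gap_min + 0.5 * (action + 1.0) * (gap_max - gap_min)


def explore(action, noise_std, rng):
    """Add Gaussian exploration noise, then clip back into [-1, 1]."""
    noisy = action + rng.gauss(0.0, noise_std)
    return max(-1.0, min(1.0, noisy))


def soft_update(target, online, tau):
    """Polyak averaging of target parameters toward the online parameters."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target, online)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical normalised state: mainline density, on-ramp flow, platoon speed.
    state = [0.6, 0.3, 0.8]
    weights = [0.5, -0.2, 0.1]
    action = explore(actor(state, weights, bias=0.0), noise_std=0.1, rng=rng)
    print(f"commanded gap: {to_gap(action):.2f} m")
```

Because the actor outputs a real number rather than an index into a discrete set, the commanded gap can take any value between the two bounds, which is the property the continuous action space is meant to provide.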