With the development of mobility-on-demand services, increasing sources of rich transportation data, and the advent of autonomous vehicles (AVs), there are significant opportunities for shared-use AV mobility services (SAMSs) to provide accessible and demand-responsive personal mobility. This paper focuses on the problem of anticipatory repositioning of idle vehicles in a SAMS fleet to enable better assignment decisions when serving future demand. The rebalancing problem is formulated as a Markov decision process, and a reinforcement learning approach using an advantage actor-critic (A2C) method is proposed to learn a rebalancing policy that anticipates future demand and cooperates with an optimization-based assignment strategy. The proposed formulation and solution approach allow for centralized repositioning decisions for the entire vehicle fleet while ensuring that the problem size does not change with the size of the fleet. Using an agent-based simulation tool and New York City taxi data to simulate demand for rides in a SAMS system, two versions of the A2C AV repositioning approach are tested: A2C-AVR(A), which observes past demand for rides and learns to anticipate future demand, and A2C-AVR(B), which receives demand forecasts. Numerical experiments demonstrate that the A2C-AVR approaches significantly reduce mean passenger wait times relative to an alternative optimization-based rebalancing approach, at the expense of a slightly increased percentage of empty fleet miles travelled. The experiments show comparable performance between A2C-AVR(A) and A2C-AVR(B), indicating that the approach can anticipate future demand based on past demand observations. In tests with various demand and time-of-day scenarios and an alternative assignment strategy, the experiments demonstrate the model's transferability to cases unseen at the training stage.
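
As a rough illustration of the advantage actor-critic idea named in the abstract, the sketch below computes the discounted returns and advantage estimates that a generic A2C update would use for one rollout segment. It is not the paper's implementation: the function name `a2c_targets`, the reward values, the critic value estimates, and the discount factor are all hypothetical placeholders chosen for illustration.

```python
from typing import List, Tuple


def a2c_targets(
    rewards: List[float],
    values: List[float],
    bootstrap_value: float,
    gamma: float = 0.99,
) -> Tuple[List[float], List[float]]:
    """Return (discounted returns, advantages) for one rollout segment.

    rewards[t]      : reward observed after the action at step t (hypothetical).
    values[t]       : critic's estimate V(s_t) for the state at step t.
    bootstrap_value : critic's estimate for the state after the last step
                      (use 0.0 if the episode terminated there).
    """
    if len(rewards) != len(values):
        raise ValueError("rewards and values must have the same length")

    returns = [0.0] * len(rewards)
    running = bootstrap_value
    # Walk backwards so each return reuses the one that follows it:
    # G_t = r_t + gamma * G_{t+1}
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running

    # Advantage: how much better the observed return was than the critic expected.
    advantages = [g - v for g, v in zip(returns, values)]
    return returns, advantages
```

In a generic A2C training loop, the advantages would weight the log-probabilities of the chosen actions in the actor loss, and the returns would serve as regression targets for the critic. How the paper defines its states, actions, and rewards for repositioning is not specified in the abstract, so none of that is modeled here.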