A Distributed Model-Free Ride-Sharing Approach for Joint Matching, Pricing, and Dispatching using Deep Reinforcement Learning

The rapid growth of ride-sharing services presents many opportunities to transform urban mobility by providing personalized and convenient transportation while keeping large-scale ride pooling efficient. However, a core problem for such services is planning a route for each driver that fulfills dynamically arriving requests while satisfying given constraints. Current models are mostly limited to static routes with only two rides per vehicle (optimally) or three (with heuristics). In this paper, we present a dynamic, demand-aware, and pricing-based vehicle-passenger matching and route planning framework that (1) dynamically generates optimal routes for each vehicle based on online demand, the pricing associated with each ride, and vehicle capacities and locations, using a matching algorithm that starts greedily and optimizes over time through an insertion operation; (2) involves drivers in the decision-making process by allowing them to propose a different price based on the expected reward for a particular ride as well as the destination locations for future rides, which is influenced by supply and demand computed by the Deep Q-network; (3) allows customers to accept or reject rides based on their preferences with respect to pricing and delay windows, vehicle type, and carpooling; and (4) re-balances idle vehicles, based on demand prediction, by dispatching them to areas of anticipated high demand using deep Reinforcement Learning (RL). Our framework is validated using the New York City Taxi public dataset; however, we consider different vehicle types and design customer utility functions to validate the setup and study different settings. Experimental results show the effectiveness of our approach in real-time and large-scale settings.
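
The abstract names an insertion operation but does not spell it out. As a rough illustration only, the sketch below shows a generic insertion heuristic: it tries every pickup and dropoff slot in a vehicle's existing stop list and keeps the one that adds the least distance. It assumes straight-line distances and ignores capacity, pricing, and delay windows; the function names `route_length` and `best_insertion` are hypothetical and are not taken from the paper.

```python
from math import dist


def route_length(start, stops):
    """Total straight-line length of visiting `stops` in order from `start`."""
    total, here = 0.0, start
    for stop in stops:
        total += dist(here, stop)
        here = stop
    return total


def best_insertion(start, stops, pickup, dropoff):
    """Insert a new request into an existing route at the cheapest position.

    Existing stops keep their relative order, and the pickup always comes
    before the dropoff. Returns (added_distance, new_stops).
    Assumption: cost is plain Euclidean distance; a real system would also
    check capacity and delay constraints before accepting a candidate.
    """
    base = route_length(start, stops)
    best = None
    n = len(stops)
    for i in range(n + 1):          # slot for the pickup
        for j in range(i, n + 1):   # slot for the dropoff, never before pickup
            candidate = stops[:i] + [pickup] + stops[i:j] + [dropoff] + stops[j:]
            added = route_length(start, candidate) - base
            if best is None or added < best[0]:
                best = (added, candidate)
    return best


if __name__ == "__main__":
    added, new_stops = best_insertion((0, 0), [(10, 0)], (2, 0), (5, 0))
    print(added, new_stops)
```

Because the search is exhaustive over slot pairs, its cost grows quadratically with the number of stops already on the route, which is acceptable for the handful of stops a single vehicle carries at one time.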


Related content

Deep Reinforcement Learning


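Deep reinforcement learning (DRL) is a machine learning approach that extends traditional reinforcement learning with deep learning techniques. The main task in traditional reinforcement learning is for an agent to learn, from the rewards it receives from the environment, the behavior that maximizes reward. However, traditional model-free reinforcement learning methods need function approximation for the agent to learn a value function or a policy. In this setting, the strong function approximation ability of deep learning naturally became the best alternative to hand-specified features, and it made better-performing end-to-end learning possible.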
