The rapid growth of ride-hailing platforms has created a highly competitive market in which businesses struggle to make a profit, which creates a need for better operational strategies. However, real-world experiments are risky and expensive for these platforms, as they deal with millions of users daily. These platforms therefore need a simulated environment in which they can predict users' reactions to changes in platform-specific parameters such as trip fares and incentives. Building such a simulation is challenging, because these platforms operate in dynamic environments where thousands of users regularly interact with one another. This paper presents a framework to mimic and predict the behavior of users, specifically drivers, in ride-hailing services. We use a data-driven approach that combines imitation learning and reinforcement learning. First, agents are trained with behavioral cloning on a real-world data set to mimic driver behavior. Next, reinforcement learning is applied to the pre-trained agents in a simulated environment so that they can adapt to changes in the platform. Our framework gives ride-hailing platforms a playground in which to experiment with platform-specific parameters and predict drivers' behavioral patterns.
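The two-stage idea described above (behavioral cloning followed by reinforcement learning in a simulator) can be illustrated with a minimal toy sketch. It is not the implementation used in this paper: the linear softmax policy, the synthetic demonstrations, the accept/reject action space, the reward function, and the use of REINFORCE as the reinforcement learning algorithm are all assumptions made for illustration, and every function and parameter name (for example `fare_multiplier`) is hypothetical.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with max-subtraction for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def sample_states(rng, n):
    """Hypothetical trip offers: [offered fare, pickup distance, bias term]."""
    fare = rng.uniform(0.0, 1.0, n)
    dist = rng.uniform(0.0, 1.0, n)
    return np.column_stack([fare, dist, np.ones(n)])


def behavioral_cloning(states, actions, lr=0.5, epochs=500):
    """Stage 1: fit the policy to logged (state, action) pairs by
    minimizing cross-entropy, i.e. ordinary supervised learning."""
    W = np.zeros((states.shape[1], 2))  # actions: 0 = reject, 1 = accept
    onehot = np.eye(2)[actions]
    for _ in range(epochs):
        probs = softmax(states @ W)
        W -= lr * states.T @ (probs - onehot) / len(states)
    return W


def reinforce_finetune(W, rng, fare_multiplier, lr=0.5, iters=300, batch=256):
    """Stage 2: REINFORCE on top of the cloned policy in a toy simulator.
    Assumed reward: accepting earns fare_multiplier * fare - distance,
    rejecting earns 0."""
    W = W.copy()
    for _ in range(iters):
        s = sample_states(rng, batch)
        probs = softmax(s @ W)
        a = (rng.uniform(size=batch) < probs[:, 1]).astype(int)
        reward = a * (fare_multiplier * s[:, 0] - s[:, 1])
        advantage = reward - reward.mean()  # batch-mean baseline
        # For a linear softmax policy, grad log pi(a|s) = s^T (onehot - probs).
        onehot = np.eye(2)[a]
        W += lr * s.T @ ((onehot - probs) * advantage[:, None]) / batch
    return W


def acceptance_rate(W, states):
    """Fraction of offers for which the greedy policy accepts."""
    return float((softmax(states @ W).argmax(axis=1) == 1).mean())


def run_demo(seed=0):
    rng = np.random.default_rng(seed)
    # Synthetic "logged" drivers: accept when fare exceeds distance, plus noise.
    demos = sample_states(rng, 2000)
    noise = rng.normal(0.0, 0.1, 2000)
    actions = (demos[:, 0] - demos[:, 1] + noise > 0).astype(int)
    W_bc = behavioral_cloning(demos, actions)

    eval_states = sample_states(rng, 2000)
    W_low = reinforce_finetune(W_bc, rng, fare_multiplier=0.5)   # fares cut
    W_high = reinforce_finetune(W_bc, rng, fare_multiplier=2.0)  # fares raised
    return (acceptance_rate(W_bc, eval_states),
            acceptance_rate(W_low, eval_states),
            acceptance_rate(W_high, eval_states))


if __name__ == "__main__":
    bc, low, high = run_demo()
    print(f"acceptance after cloning:      {bc:.2f}")
    print(f"after RL with lower fares:     {low:.2f}")
    print(f"after RL with higher fares:    {high:.2f}")
```

In this sketch the cloned policy reproduces the logged accept/reject pattern, and fine-tuning under a changed `fare_multiplier` shifts the acceptance rate in the direction the new reward favors, which is the kind of "what if" question the framework is meant to answer.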