In recent years, online ride-hailing platforms have become an indispensable part of urban transportation. After the platform matches a passenger with a driver, both parties are free to accept or cancel the ride with one click. Accurately predicting whether a passenger-driver pair is a good match is therefore crucial for ride-hailing platforms when devising instant order assignments. However, since the users of ride-hailing platforms consist of two parties, decision-making needs to account for the dynamics of the driver side and the passenger side simultaneously. This makes the task more challenging than traditional online advertising tasks. Moreover, the amount of available data is severely imbalanced across cities, which makes it difficult to train an accurate model for smaller cities with scarce data. Although a sophisticated neural network architecture can help improve prediction accuracy under data scarcity, an overly complex design will impede the model's capacity to deliver timely predictions in a production environment. In this paper, to accurately predict the MSR of a passenger-driver pair, we propose the Multi-View model (MV), which comprehensively learns the interactions among the dynamic features of the passenger, the driver, the trip order, and the context. To address the data imbalance problem, we further design the Knowledge Distillation framework (KD), which supplements the model's predictive power for smaller cities with knowledge from cities with denser data and also produces a simple model that supports efficient deployment. Finally, we conduct extensive experiments on real-world datasets from several different cities, which demonstrate the superiority of our solution.