Recent technological developments have brought a boom of new Demand-Driven Services (DDS) into urban life, including ridesharing, on-demand delivery, express systems, and warehousing. In DDS, the service loop is the elemental structure; it comprises a service worker, the service providers, and the corresponding service targets. Service workers transport either people or parcels from the providers to the target locations. The planning tasks within DDS can thus be classified into two stages: 1) Dispatching, which forms service loops from the demand/supply distributions, and 2) Routing, which decides the specific serving order within the constructed loops. Generating high-quality strategies in both stages is important for developing DDS but faces several challenges. Meanwhile, deep reinforcement learning (DRL) has developed rapidly in recent years. It is a powerful tool for these problems because DRL can learn a parametric model without relying on many problem-specific assumptions and can optimize long-term effects by learning sequential decisions. In this survey, we first define DDS, then highlight common applications and the important decision/control problems within them. For each problem, we comprehensively introduce the existing DRL solutions, and further summarize them at https://github.com/tsinghua-fib-lab/DDS_Survey. We also introduce open simulation environments for the development and evaluation of DDS applications. Finally, we analyze the remaining challenges and discuss further research opportunities in DRL solutions for DDS.