Dynamic Bicycle Dispatching of Dockless Public Bicycle-sharing Systems using Multi-objective Reinforcement Learning

As a new generation of Public Bicycle-sharing Systems (PBS), the dockless PBS (DL-PBS) is an important application of cyber-physical systems (CPS) and intelligent transportation. Using AI to provide efficient bicycle dispatching solutions based on dynamic bicycle rental demand is an essential issue for DL-PBS. In this paper, we propose a dynamic bicycle dispatching algorithm based on multi-objective reinforcement learning (MORL-BD) to provide the optimal bicycle dispatching solution for DL-PBS. We model the DL-PBS system from the perspective of CPS and use deep learning to predict the layout of bicycle parking spots and the dynamic demand for bicycle dispatching. We define the multi-route bicycle dispatching problem as a multi-objective optimization problem with four objectives: dispatching costs, the initial load of each dispatch truck, workload balance among the trucks, and the dynamic balance of bicycle supply and demand. On this basis, the collaborative multi-route bicycle dispatching problem among multiple dispatch trucks is modeled as a multi-agent MORL model. All dispatch paths between parking spots are defined as the state space, and the reciprocal of the dispatching cost is defined as the reward. Each dispatch truck is equipped with an agent that learns the optimal dispatch path in the dynamic DL-PBS network. We create an elite list to store the Pareto optimal solutions of bicycle dispatch paths found at each action, and finally obtain the Pareto frontier. Experimental results on real DL-PBS systems show that, compared with existing methods, MORL-BD finds a higher-quality Pareto frontier in less execution time.
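
The elite list and the reciprocal-cost reward described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes every objective is expressed so that smaller is better, that each candidate dispatch path is summarized by a tuple of objective values, and that the dispatching cost is strictly positive. The names `dominates`, `update_elite_list`, and `reward` are hypothetical.

```python
from typing import List, Tuple

# One candidate dispatch path, summarized by its objective values.
# Assumption: every objective is to be minimized.
Objectives = Tuple[float, ...]


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b on every objective and better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def update_elite_list(elite: List[Objectives], candidate: Objectives) -> List[Objectives]:
    """Return a new elite list that contains only mutually non-dominated solutions."""
    # Reject the candidate if an existing member dominates or duplicates it.
    if any(dominates(e, candidate) or e == candidate for e in elite):
        return elite
    # Otherwise drop every member the candidate dominates, then add it.
    return [e for e in elite if not dominates(candidate, e)] + [candidate]


def reward(dispatch_cost: float) -> float:
    """Reciprocal-cost reward; assumes dispatch_cost > 0."""
    return 1.0 / dispatch_cost


# Hypothetical (cost, workload imbalance) pairs, invented for illustration only.
elite: List[Objectives] = []
for path_objectives in [(10.0, 3.0), (8.0, 5.0), (9.0, 2.0), (12.0, 6.0)]:
    elite = update_elite_list(elite, path_objectives)
```

After the loop, `elite` holds only the candidates that no other candidate dominates, which is the sense in which the stored solutions approximate a Pareto frontier.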
