Existing deep reinforcement learning (DRL) based methods for solving the capacitated vehicle routing problem (CVRP) intrinsically cope with homogeneous vehicle fleet, in which the fleet is assumed as repetitions of a single vehicle. Hence, their key to construct a solution solely lies in the selection of the next node (customer) to visit excluding the selection of vehicle. However, vehicles in real-world scenarios are likely to be heterogeneous with different characteristics that affect their capacity (or travel speed), rendering existing DRL methods less effective. In this paper, we tackle heterogeneous CVRP (HCVRP), where vehicles are mainly characterized by different capacities. We consider both min-max and min-sum objectives for HCVRP, which aim to minimize the longest or total travel time of the vehicle(s) in the fleet. To solve those problems, we propose a DRL method based on the attention mechanism with a vehicle selection decoder accounting for the heterogeneous fleet constraint and a node selection decoder accounting for the route construction, which learns to construct a solution by automatically selecting both a vehicle and a node for this vehicle at each step. Experimental results based on randomly generated instances show that, with desirable generalization to various problem sizes, our method outperforms the state-of-the-art DRL method and most of the conventional heuristics, and also delivers competitive performance against the state-of-the-art heuristic method, i.e., SISR. Additionally, the results of extended experiments demonstrate that our method is also able to solve CVRPLib instances with satisfactory performance.
翻译:现有基于深层加固学习(DRL)的解决机动车辆路由问题的现有方法(CRP)与同一车队(CVRP)密不可分,其中车队被认为是单一车辆的重复,因此,其构建解决方案的关键完全在于选择下一个节点(客户)来访问,但不选择车辆。然而,现实世界中的车辆很可能是多种多样的,其特点不同,影响到车辆的容量(或旅行速度),使现有的DRL方法不那么有效。在本文中,我们处理混合的CVRP(CVRP)问题,其中车辆主要具有不同能力的特征。我们考虑HCVRP的最小和小和小总总目标,其目的是将车队中最长或总的旅行时间减少到最小。为了解决这些问题,我们建议基于关注机制的DRL方法,即车辆选择分解机群制约(或旅行速度),以及道路建设的节点选择分解器,我们学会通过自动选择车辆和车辆节点来构建解决方案,我们每个步骤的机动车辆的最小和小节点。根据常规方法,实验结果显示我们最优度的进度的进度。