Drones have been considered as an alternative means of package delivery that can reduce delivery cost and time. Because of their battery limitations, drones are best suited for last-mile delivery, i.e., delivery from package distribution centers (PDCs) to customers. Since a typical delivery system consists of multiple PDCs, each with random and time-varying demand, dynamic drone-to-PDC allocation is important for meeting that demand efficiently. In this paper, we study the dynamic assignment of unmanned aerial vehicles (UAVs) in a drone delivery system, with the goal of providing measurable Quality of Service (QoS) guarantees. We adopt a queueing-theoretic approach to model the customer-service nature of the problem, and we use deep reinforcement learning to obtain a dynamic policy for re-allocating the UAVs. This policy guarantees a probabilistic upper bound on the length of the queue of packages waiting at each PDC, which benefits both the service provider and the customers. We evaluate the proposed algorithm under three broad arrival classes: Bernoulli, Time-Varying Bernoulli, and Markov-Modulated Bernoulli arrivals. Our results show that the proposed method outperforms the baselines, particularly under Time-Varying and Markov-Modulated Bernoulli arrivals, which are more representative of real-world demand patterns. Moreover, our algorithm satisfies the QoS constraints in all the studied scenarios while minimizing the average number of UAVs in use.
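To make the three arrival classes and the queue-length guarantee concrete, the following is a minimal simulation sketch of a single PDC queue. It is not the method of the paper: the service model (one package departs per slot with a fixed probability), the two-state modulating chain, and all names and parameters (`service_prob`, `q_max`, `switch_prob`, and so on) are assumptions chosen for illustration. It reads the probabilistic bound as "the fraction of time slots in which the queue exceeds a threshold `q_max` should stay below some tolerance", which is one common interpretation and may differ from the paper's exact definition.

```python
import random


def bernoulli_arrivals(p, steps, rng):
    """Constant-rate Bernoulli arrivals: one package per slot with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(steps)]


def time_varying_arrivals(p_of_t, steps, rng):
    """Bernoulli arrivals whose probability p_of_t(t) depends on the slot index t."""
    return [1 if rng.random() < p_of_t(t) else 0 for t in range(steps)]


def markov_modulated_arrivals(rates, switch_prob, steps, rng):
    """Bernoulli arrivals whose rate follows a two-state Markov chain (assumed)."""
    state, out = 0, []
    for _ in range(steps):
        out.append(1 if rng.random() < rates[state] else 0)
        if rng.random() < switch_prob:  # switch between low and high demand
            state = 1 - state
    return out


def violation_fraction(arrivals, service_prob, q_max, rng):
    """Fraction of slots in which the queue length exceeds q_max.

    Assumed service model: if the queue is non-empty, one package departs
    in a slot with probability service_prob (a stand-in for drone capacity).
    """
    queue, violations = 0, 0
    for a in arrivals:
        queue += a
        if queue > 0 and rng.random() < service_prob:
            queue -= 1
        if queue > q_max:
            violations += 1
    return violations / len(arrivals)


if __name__ == "__main__":
    rng = random.Random(0)
    steps = 10_000
    scenarios = {
        "Bernoulli": bernoulli_arrivals(0.3, steps, rng),
        "Time-Varying": time_varying_arrivals(
            lambda t: 0.1 if (t // 500) % 2 == 0 else 0.5, steps, rng
        ),
        "Markov-Modulated": markov_modulated_arrivals((0.1, 0.5), 0.01, steps, rng),
    }
    for name, arrivals in scenarios.items():
        frac = violation_fraction(arrivals, service_prob=0.4, q_max=5, rng=rng)
        print(f"{name}: fraction of slots with queue > 5 = {frac:.3f}")
```

In this toy setting, raising `service_prob` plays the role of assigning more UAVs to the PDC. A dynamic policy would adjust it over time so that the violation fraction stays below the tolerance while using as little capacity as possible.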