With the continuous growth of the global economy and markets, resource imbalance has become one of the central issues in real-world logistics scenarios. In marine transportation, this trade imbalance gives rise to Empty Container Repositioning (ECR) problems: once freight has been delivered from an exporting country to an importing one, the laden containers become empty containers that must be repositioned to satisfy new requests for goods in exporting countries. In such problems, the performance that any cooperative repositioning policy can achieve depends strictly on the routes that vessels follow (i.e., fleet deployment). Historically, Operations Research (OR) approaches have been proposed to jointly optimize the repositioning policy together with the fleet of vessels. However, the stochasticity of future container supply and demand, together with the black-box and non-linear constraints present in the environment, makes these approaches unsuitable for such scenarios. In this paper, we introduce a novel framework, Configurable Semi-POMDPs, to model this type of problem. Furthermore, we provide a two-stage learning algorithm, "Configure & Conquer" (CC), that first configures the environment by finding an approximation of the optimal fleet deployment strategy, and then "conquers" it by learning an ECR policy in this tuned environmental setting. We validate our approach on large, real-world instances of the problem. Our experiments highlight that CC avoids the pitfalls of OR methods and successfully optimizes both the ECR policy and the fleet of vessels, leading to superior performance in world trade environments.