We describe a novel decision-making problem developed in response to the demands of retail electronic commerce (e-commerce). Working with collaborators in the logistics and retail industries, we found that the cost of delivering products from the most opportune node in the supply chain (a quantity called the cost-to-serve, or CTS) is a key challenge. The large scale, high stochasticity, and wide geographical spread of e-commerce supply chains make this setting well suited to a carefully designed, data-driven decision-making algorithm. In this preliminary work, we focus on the specific subproblem of delivering multiple products in arbitrary quantities from any warehouse to multiple customers in each time period. We compare the relative performance and computational efficiency of several baselines, including heuristics and mixed-integer linear programming. We show that a reinforcement-learning-based algorithm is competitive with these policies and has the potential to scale up efficiently in the real world.