Delivery Time Estimation (DTE) is a crucial component of the e-commerce supply chain that predicts delivery time based on merchant information, sending address, receiving address, and payment time. Accurate DTE can boost platform revenue and reduce customer complaints and refunds. However, the imbalanced nature of industrial data prevents previous models from reaching satisfactory prediction performance. Although imbalanced regression methods can be applied to the DTE task, we find experimentally that they improve prediction performance on low-shot samples at the expense of overall performance. To address this issue, we propose a novel Dual Graph Multitask framework for imbalanced Delivery Time Estimation (DGM-DTE). Our framework first classifies each package's delivery time as head or tail data. A dual graph-based model then learns representations of the two categories of data. In particular, DGM-DTE re-weights the embedding of tail data by estimating its kernel density. We fuse the two graph-based representations to capture both high-shot and low-shot data. Experiments on real-world Taobao logistics datasets demonstrate the superior performance of DGM-DTE compared to baselines.
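The abstract does not spell out how the head/tail split or the kernel-density re-weighting is computed, so the following is only a minimal sketch of the general idea, not the DGM-DTE implementation. It assumes a fixed delivery-time threshold for the head/tail split, a Gaussian kernel with a hand-picked bandwidth, and inverse-density weights normalized to average 1 over the tail samples. All function names (`gaussian_kde`, `split_head_tail`, `tail_weights`, `reweight_embeddings`) and parameter values are hypothetical.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density of `samples`
    using a Gaussian kernel with the given bandwidth."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def split_head_tail(delivery_hours, threshold):
    """Mark each sample as tail (True) if its delivery time exceeds
    `threshold`, else head (False). The threshold rule is an assumption."""
    return [t > threshold for t in delivery_hours]


def tail_weights(delivery_hours, is_tail, bandwidth=6.0):
    """Assumed scheme: tail samples get weights proportional to the inverse
    of their estimated label density, rescaled so tail weights average 1.
    Head samples keep weight 1."""
    density = gaussian_kde(delivery_hours, bandwidth)
    inv = [1.0 / density(t) if tail else None
           for t, tail in zip(delivery_hours, is_tail)]
    tail_inv = [v for v in inv if v is not None]
    if not tail_inv:
        return [1.0] * len(delivery_hours)
    mean_inv = sum(tail_inv) / len(tail_inv)
    return [1.0 if v is None else v / mean_inv for v in inv]


def reweight_embeddings(embeddings, weights):
    """Scale each embedding vector by its sample weight."""
    return [[w * x for x in vec] for vec, w in zip(embeddings, weights)]


# Toy usage with made-up delivery times (hours) and 2-d embeddings.
hours = [24, 26, 25, 27, 24, 48, 50, 96]
tail_mask = split_head_tail(hours, threshold=40)
weights = tail_weights(hours, tail_mask)
toy_embeddings = [[1.0, 0.5] for _ in hours]
scaled = reweight_embeddings(toy_embeddings, weights)
```

Under this scheme, the rarer a tail delivery time is, the larger the weight on its embedding, while head samples are left unchanged. A real system would likely tune the bandwidth and threshold on validation data.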