Multi-horizon probabilistic time series forecasting has wide applicability to real-world tasks such as demand forecasting. Recent work in neural time-series forecasting mainly focuses on the use of Seq2Seq architectures. For example, MQTransformer, an improvement on MQCNN, has shown state-of-the-art performance in probabilistic demand forecasting. In this paper, we consider incorporating cross-entity information to enhance model performance by adding a cross-entity attention mechanism along with a retrieval mechanism to select which entities to attend over. We demonstrate how our new neural architecture, MQRetNN, leverages the encoded contexts from a baseline model pretrained on the entire population to improve forecasting accuracy. Using MQCNN as the baseline model (due to computational constraints, we do not use MQTransformer), we first show on a small demand forecasting dataset that it is possible to achieve a ~3% improvement in test loss by adding a cross-entity attention mechanism in which each entity attends to all others in the population. We then evaluate the model with our proposed retrieval methods, which serve as a means of approximating attention over a large population, on a large-scale demand forecasting application with over 2 million products and observe a ~1% performance gain over the MQCNN baseline.
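The idea of retrieving a small set of entities and attending only over them can be illustrated with a minimal NumPy sketch. It is not the MQRetNN implementation: the abstract does not specify the retrieval methods, so cosine-similarity top-k retrieval is assumed here. The attention uses plain scaled dot products with no learned projections, and the function names `retrieve_top_k` and `cross_entity_attention` are hypothetical. The `contexts` array stands in for the encoded contexts produced by a pretrained baseline model.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def retrieve_top_k(contexts, k):
    """Return, for each entity, the indices of its k most similar other entities.

    Assumption: similarity is cosine similarity between encoded contexts.
    contexts has shape (num_entities, dim); the result has shape (num_entities, k).
    """
    norms = np.linalg.norm(contexts, axis=1, keepdims=True)
    normed = contexts / np.maximum(norms, 1e-12)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)  # an entity never retrieves itself
    return np.argsort(-sim, axis=1)[:, :k]


def cross_entity_attention(contexts, neighbor_idx):
    """Let each entity attend over its retrieved neighbors.

    Returns the original context concatenated with the attended summary,
    together with the attention weights of shape (num_entities, k).
    """
    dim = contexts.shape[1]
    neighbors = contexts[neighbor_idx]  # (num_entities, k, dim)
    scores = np.einsum("nd,nkd->nk", contexts, neighbors) / np.sqrt(dim)
    weights = softmax(scores, axis=1)
    attended = np.einsum("nk,nkd->nd", weights, neighbors)
    return np.concatenate([contexts, attended], axis=1), weights


# Toy usage: 6 entities with 4-dimensional contexts, 3 retrieved neighbors each.
rng = np.random.default_rng(0)
contexts = rng.normal(size=(6, 4))
neighbor_idx = retrieve_top_k(contexts, k=3)
enriched, weights = cross_entity_attention(contexts, neighbor_idx)
```

Setting `k` to the number of entities minus one recovers the small-dataset setting in which each entity attends to all others. A smaller `k` keeps the attention cost per entity fixed as the population grows, which is the motivation for retrieval at the scale of millions of products.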