Deep learning has brought great progress to sequential recommendation (SR) tasks. With advanced residual network structures, sequential recommender models can be stacked with many hidden layers, e.g., up to 100 layers on real-world SR datasets. Training such a deep network is computationally expensive and time-consuming, especially when there are tens of billions of user-item interactions. To address this challenge, we present StackRec, a simple yet highly efficient training framework for deep SR models based on layer stacking. Specifically, we first offer an important insight: the residual layers/blocks in a well-trained deep SR model have similar distributions. Motivated by this, we propose progressively stacking such pre-trained residual layers/blocks to yield a deeper but easier-to-train SR model. We validate StackRec by instantiating it with two state-of-the-art SR models in three practical scenarios on real-world datasets. Extensive experiments show that StackRec achieves not only comparable performance but also significant training-time acceleration compared to SR models trained from scratch.
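The core mechanism, progressive layer stacking, amounts to copying the residual blocks of a well-trained shallow model to warm-start a deeper one, which is then fine-tuned rather than trained from scratch. Below is a minimal PyTorch-style sketch of this idea; the names `ResidualBlock` and `stack_blocks`, the block architecture, and the particular copy order are illustrative assumptions, not StackRec's actual implementation.

```python
# Minimal sketch of progressive layer stacking (hypothetical names and
# architecture, not the paper's code). The idea: since residual blocks in a
# well-trained deep SR model have similar distributions, copies of trained
# blocks make a good initialization for a deeper model, which then needs
# only brief fine-tuning instead of full training from scratch.
import copy
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """One residual layer computing x + F(x), the unit being stacked."""
    def __init__(self, dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, dim),
            nn.ReLU(),
            nn.Linear(dim, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.net(x)

def stack_blocks(trained: nn.ModuleList) -> nn.ModuleList:
    """Double the depth by appending deep copies of the trained blocks.

    The exact copy order (duplicating each block in place vs. copying the
    whole stack on top of itself, as done here) is a design choice within
    this general stacking scheme.
    """
    copies = [copy.deepcopy(b) for b in trained]
    return nn.ModuleList(list(trained) + copies)

# Usage: train a shallow model, stack, then fine-tune the deeper model.
shallow = nn.ModuleList(ResidualBlock(64) for _ in range(8))
# ... train `shallow` to convergence on user-item interaction data ...
deep = stack_blocks(shallow)  # 16 blocks warm-started from 8 trained ones
x = torch.randn(32, 64)       # a batch of 64-dim sequence representations
for block in deep:
    x = block(x)
```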