Sequential recommendation (SR) aims to model users' dynamic preferences from their historical interactions. Recently, Transformers and convolutional neural networks (CNNs) have shown great success in learning representations for SR. Nevertheless, Transformers mainly focus on capturing content-based global interactions, while CNNs effectively exploit local features in practical recommendation scenarios. Thus, how to effectively aggregate CNNs and Transformers to model both the local and global dependencies of a historical item sequence remains an open challenge and is rarely studied in SR. To this end, we inject a locality inductive bias into the Transformer by combining its global attention mechanism with a local convolutional filter, and we adaptively determine the mixing importance on a personalized basis through module- and layer-aware adaptive mixture units; we name the resulting model AdaMCT. Moreover, considering that softmax-based attention may encourage unimodal activation, we introduce Squeeze-Excitation Attention (with sigmoid activation) into sequential recommendation to capture multiple relevant items (keys) simultaneously. Extensive experiments on three widely used benchmark datasets demonstrate that AdaMCT significantly outperforms previous Transformer- and CNN-based models by an average of 18.46% and 60.85%, respectively, in terms of NDCG@5, achieving state-of-the-art performance.
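The following is a minimal PyTorch sketch of the two mechanisms the abstract describes: a sigmoid-gated (SE-style) attention that can activate several relevant items at once, and a layer that mixes a global self-attention branch with a local convolutional branch through a learnable gate. All class names, the scalar per-layer gate, and the hyperparameters are illustrative assumptions for exposition, not the authors' implementation.

```python
import torch
import torch.nn as nn


class SqueezeExcitationAttention(nn.Module):
    """Sigmoid-gated attention over the item sequence. Each item (key)
    receives an independent weight in (0, 1), so multiple relevant items
    can be highly activated simultaneously, unlike softmax weights that
    compete and sum to 1."""

    def __init__(self, hidden_dim: int, reduction: int = 2):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim // reduction),
            nn.ReLU(),
            nn.Linear(hidden_dim // reduction, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, hidden_dim)
        weights = torch.sigmoid(self.score(x))  # (batch, seq_len, 1)
        return x * weights                      # re-scale each item


class AdaptiveMixtureLayer(nn.Module):
    """Mixes a global self-attention branch with a local 1-D convolution
    branch via a learnable, layer-specific gate (one hypothetical reading
    of the 'adaptive mixture units' described above)."""

    def __init__(self, hidden_dim: int, num_heads: int = 2, kernel_size: int = 3):
        super().__init__()
        self.attn = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)
        self.conv = nn.Conv1d(hidden_dim, hidden_dim, kernel_size,
                              padding=kernel_size // 2)
        # One scalar gate per layer, squashed into (0, 1) by a sigmoid.
        self.mix_logit = nn.Parameter(torch.zeros(1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        global_out, _ = self.attn(x, x, x)                        # global branch
        local_out = self.conv(x.transpose(1, 2)).transpose(1, 2)  # local branch
        alpha = torch.sigmoid(self.mix_logit)                     # mixing weight
        return alpha * global_out + (1.0 - alpha) * local_out


if __name__ == "__main__":
    items = torch.randn(8, 50, 64)  # (batch, seq_len, embedding_dim)
    out = SqueezeExcitationAttention(64)(AdaptiveMixtureLayer(64)(items))
    print(out.shape)                # torch.Size([8, 50, 64])
```

Note that this sketch uses a single scalar gate per layer for brevity; the abstract's "personalized" mixing suggests the gate would instead be predicted from the user's sequence representation.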