Modern deep neural networks (DNNs) have greatly facilitated the development of sequential recommender systems by achieving state-of-the-art performance on various sequential recommendation tasks. Given a sequence of interacted items, existing DNN-based sequential recommenders commonly embed each item into a unique vector to support subsequent computation of user interest. However, due to the potentially large number of items, the over-parameterised item embedding matrix of a sequential recommender has become a memory bottleneck for efficient deployment in resource-constrained environments, e.g., smartphones and other edge devices. Furthermore, we observe that the widely-used multi-head self-attention, though effective at modelling sequential dependencies among items, heavily relies on redundant attention units to fully capture both global and local item-item transition patterns within a sequence. In this paper, we introduce a novel lightweight self-attentive network (LSAN) for sequential recommendation. To aggressively compress the original embedding matrix, LSAN leverages the notion of compositional embeddings, where each item embedding is composed by merging a group of selected base embedding vectors derived from substantially smaller embedding matrices. Meanwhile, to account for the intrinsic dynamics of each item, we further propose a temporal context-aware embedding composition scheme. In addition, we develop an innovative twin-attention network that alleviates the redundancy of the traditional multi-head self-attention while retaining full capacity for capturing long- and short-term (i.e., global and local) item dependencies. Comprehensive experiments demonstrate that LSAN significantly advances the accuracy and memory efficiency of existing sequential recommenders.
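To make the compression idea concrete, the following is a minimal, hypothetical sketch of compositional item embeddings using a quotient-remainder decomposition. The class name, the element-wise merge, and the bucket count are illustrative assumptions for exposition only, not LSAN's actual composition scheme or its temporal context-aware variant.

```python
# Illustrative sketch (assumption): compositional item embeddings built from
# two small base tables instead of one |I| x d item embedding matrix.
import torch
import torch.nn as nn

class CompositionalItemEmbedding(nn.Module):
    def __init__(self, num_items: int, dim: int, num_buckets: int):
        super().__init__()
        self.num_buckets = num_buckets
        # Two substantially smaller base embedding matrices.
        self.quotient = nn.Embedding((num_items + num_buckets - 1) // num_buckets, dim)
        self.remainder = nn.Embedding(num_buckets, dim)

    def forward(self, item_ids: torch.LongTensor) -> torch.Tensor:
        # Each item selects one base vector from each table and merges them
        # (here by element-wise product) into its final embedding.
        q = self.quotient(item_ids // self.num_buckets)
        r = self.remainder(item_ids % self.num_buckets)
        return q * r

# Usage: 50,000 items are covered by 250 + 200 base rows instead of 50,000 rows.
emb = CompositionalItemEmbedding(num_items=50_000, dim=64, num_buckets=200)
vectors = emb(torch.tensor([[3, 17, 42_000]]))  # shape (1, 3, 64)
```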