Transformer architectures have been widely adopted in sequential recommender systems (SRS). However, as user interaction histories grow longer, computational time and memory requirements grow accordingly, primarily because of the standard attention mechanism. Although many methods employ efficient attention or SSM-based models, these approaches struggle to model long sequences effectively and may exhibit unstable performance on short sequences. To address these challenges, we design a sparse attention mechanism, BlossomRec, which models both long-term and short-term user interests through attention computation and achieves stable performance across sequences of varying lengths. Specifically, we divide user interests in recommender systems into long-term and short-term interests, compute them with two distinct sparse attention patterns, and fuse the results through a learnable gate. Theoretically, this design significantly reduces the number of interactions that participate in attention computation. Extensive experiments on four public datasets demonstrate that BlossomRec, when integrated with state-of-the-art Transformer-based models, achieves comparable or even superior performance while significantly reducing memory usage, providing strong evidence of its efficiency and effectiveness. The code is available at https://github.com/ronineume/BlossomRec.
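To make the two-branch idea concrete, below is a minimal PyTorch sketch of a gated combination of a short-term (local windowed) and a long-term (strided) sparse attention branch. The names (GatedSparseAttention, window, stride) and the specific sparsity patterns are illustrative assumptions, not BlossomRec's exact design; the masks are materialized densely here for clarity, whereas an actual sparse kernel would avoid building the full score matrix.

```python
# Illustrative sketch only: two sparse attention branches fused by a learnable gate.
import torch
import torch.nn as nn
import torch.nn.functional as F


def local_window_attention(q, k, v, window: int):
    """Short-term branch: each position attends only to the last `window` positions."""
    L = q.size(-2)
    idx = torch.arange(L, device=q.device)
    # mask is True where attention is disallowed (future items or outside the window)
    mask = (idx[None, :] > idx[:, None]) | (idx[:, None] - idx[None, :] >= window)
    scores = q @ k.transpose(-2, -1) / q.size(-1) ** 0.5
    scores = scores.masked_fill(mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ v


def strided_attention(q, k, v, stride: int):
    """Long-term branch: each position attends to every `stride`-th earlier position."""
    L = q.size(-2)
    idx = torch.arange(L, device=q.device)
    causal = idx[None, :] > idx[:, None]
    off_stride = (idx[:, None] - idx[None, :]) % stride != 0
    scores = q @ k.transpose(-2, -1) / q.size(-1) ** 0.5
    scores = scores.masked_fill(causal | off_stride, float("-inf"))
    return F.softmax(scores, dim=-1) @ v


class GatedSparseAttention(nn.Module):
    """Fuse long- and short-term sparse attention with a learnable per-position gate."""

    def __init__(self, d_model: int, window: int = 16, stride: int = 8):
        super().__init__()
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.gate = nn.Linear(d_model, 1)  # gate value in (0, 1) per position
        self.out = nn.Linear(d_model, d_model)
        self.window, self.stride = window, stride

    def forward(self, x):                      # x: (batch, seq_len, d_model)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        short = local_window_attention(q, k, v, self.window)
        long = strided_attention(q, k, v, self.stride)
        g = torch.sigmoid(self.gate(x))        # learnable gated fusion
        return self.out(g * short + (1 - g) * long)


if __name__ == "__main__":
    x = torch.randn(2, 64, 32)
    print(GatedSparseAttention(d_model=32)(x).shape)  # torch.Size([2, 64, 32])
```

With a window of size w and a stride s, each position attends to at most w + L/s items instead of all L preceding items, which is the sense in which the number of interactions participating in attention is reduced.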