Scalable Dynamic Embedding Size Search for Streaming Recommendation

Recommender systems typically represent users and items with learned embeddings, which are usually assigned a uniform dimension and account for most of the model's parameters. However, real-world recommender systems often operate in streaming scenarios, where the number of users and items keeps growing, so these embeddings consume substantial storage. A few methods attempt to mitigate this by using embedding size search to assign different embedding dimensions in streaming recommendation, but they assume that the embedding size grows with the frequency of users/items, so the total size still eventually exceeds the predefined memory budget. To address this issue, this paper proposes to learn Scalable Lightweight Embeddings for streaming recommendation, called SCALL, which adaptively adjusts the embedding sizes of users/items within a given memory budget over time. Specifically, we propose to sample embedding sizes from a probabilistic distribution, which is guaranteed to meet any predefined memory budget. With the memory budget fixed, the proposed sampling strategy can both increase and decrease embedding sizes in accordance with the frequency of the corresponding users or items. Furthermore, we develop a reinforcement learning-based search paradigm that models each state with mean pooling, which keeps the length of the state vectors fixed and invariant to the changing number of users and items. As a result, the proposed method can provide embedding sizes for unseen users and items. Comprehensive empirical evaluations on two public datasets confirm the effectiveness of the proposed method.
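The two mechanisms described in the abstract, budget-constrained size sampling and mean-pooled states, can be illustrated with a minimal Python sketch. This is not SCALL's actual algorithm: the abstract does not specify the sampling distribution or the state features, so the frequency-proportional categorical draw, the add-one smoothing, and the names `sample_embedding_sizes`, `mean_pool_state`, `budget`, and `max_dim` are all assumptions made purely for illustration.

```python
import random


def sample_embedding_sizes(frequencies, budget, max_dim, seed=0):
    """Sample one embedding size per entity; the sizes sum to at most `budget`.

    Assumption (not from the paper): dimensions are handed out one at a time
    by a categorical draw whose weights are the smoothed entity frequencies.
    """
    rng = random.Random(seed)
    n = len(frequencies)
    sizes = [0] * n
    # Never hand out more dimensions than the entities can hold in total.
    remaining = min(budget, n * max_dim)
    while remaining > 0:
        # Only entities still below the per-entity cap may grow.
        eligible = [i for i in range(n) if sizes[i] < max_dim]
        # Add-one smoothing so that zero-frequency entities can still be drawn.
        weights = [frequencies[i] + 1.0 for i in eligible]
        chosen = rng.choices(eligible, weights=weights, k=1)[0]
        sizes[chosen] += 1
        remaining -= 1
    return sizes


def mean_pool_state(feature_rows):
    """Average per-entity feature vectors into one fixed-length state vector."""
    if not feature_rows:
        return []
    width = len(feature_rows[0])
    count = len(feature_rows)
    return [sum(row[j] for row in feature_rows) / count for j in range(width)]


if __name__ == "__main__":
    # Hypothetical interaction counts for four entities, then five.
    before = sample_embedding_sizes([50, 5, 1, 0], budget=16, max_dim=8)
    after = sample_embedding_sizes([50, 5, 1, 0, 3], budget=16, max_dim=8)
    print(before, sum(before))
    print(after, sum(after))
    # The state length depends on the feature width, not on the entity count.
    print(len(mean_pool_state([[50.0, 6.0], [5.0, 1.0], [1.0, 0.0]])))
```

Because the loop hands out exactly `min(budget, n * max_dim)` dimensions, the total stays within the budget however many entities arrive, and re-sampling after the frequencies change can shrink one entity's size while growing another's. The pooled state has as many entries as there are per-entity features, so its length is unaffected by the number of users or items.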
