Sequential recommendation models the dynamics of a user's previous behaviors in order to forecast the next item, and has drawn considerable attention. Transformer-based approaches, which embed items as vectors and use dot-product self-attention to measure the relationship between items, demonstrate superior capabilities among existing sequential methods. However, users' real-world sequential behaviors are \textit{\textbf{uncertain}} rather than deterministic, which poses a significant challenge to existing techniques. We further argue that dot-product-based approaches cannot fully capture \textit{\textbf{collaborative transitivity}}, which can be derived from item-item transitions within sequences and is beneficial for cold-start items. In addition, we argue that the BPR loss places no constraint on the relationship between the positive and sampled negative items, which misleads the optimization. To overcome these issues, we propose a novel \textbf{STO}chastic \textbf{S}elf-\textbf{A}ttention~(STOSA) model. Specifically, STOSA embeds each item as a stochastic Gaussian distribution whose covariance encodes the uncertainty. We devise a novel Wasserstein Self-Attention module to characterize item-item position-wise relationships in sequences, which effectively incorporates uncertainty into model training. Because the Wasserstein distance satisfies the triangle inequality, Wasserstein attention also facilitates the learning of collaborative transitivity. Moreover, we introduce a novel regularization term to the ranking loss, which ensures dissimilarity between the positive and negative items. Extensive experiments on five real-world benchmark datasets demonstrate the superiority of the proposed model over state-of-the-art baselines, especially on cold-start items. The code is available at \url{https://github.com/zfan20/STOSA}.
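To make the idea of attention over Gaussian item embeddings concrete, the following is a minimal sketch and not the authors' implementation. It assumes diagonal covariances, for which the squared 2-Wasserstein distance between two Gaussians has the closed form $\|\mu_a - \mu_b\|_2^2 + \|\sigma_a - \sigma_b\|_2^2$ (with $\sigma$ the per-dimension standard deviation). The function names `wasserstein2_sq` and `wasserstein_attention`, the use of the negative distance as the attention score, and the causal mask are illustrative assumptions rather than details stated in the abstract.

```python
import numpy as np


def wasserstein2_sq(mu_a, var_a, mu_b, var_b):
    """Squared 2-Wasserstein distance between diagonal Gaussians.

    mu_* are means and var_* are per-dimension variances (assumed diagonal
    covariance); the last axis is the embedding dimension.
    """
    mean_term = np.sum((mu_a - mu_b) ** 2, axis=-1)
    std_term = np.sum((np.sqrt(var_a) - np.sqrt(var_b)) ** 2, axis=-1)
    return mean_term + std_term


def wasserstein_attention(mu, var):
    """Attention weights from pairwise Wasserstein distances.

    mu, var: arrays of shape (seq_len, dim).
    Returns a (seq_len, seq_len) row-stochastic matrix.
    Assumption: score = -distance, with a causal mask so that position i
    attends only to positions j <= i.
    """
    dist = wasserstein2_sq(
        mu[:, None, :], var[:, None, :], mu[None, :, :], var[None, :, :]
    )
    scores = -dist  # smaller distance means larger attention score
    n = mu.shape[0]
    future = np.triu(np.ones((n, n), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    # Numerically stable softmax over each row.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    return weights / weights.sum(axis=-1, keepdims=True)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mu = rng.normal(size=(4, 8))
    var = rng.uniform(0.1, 1.0, size=(4, 8))  # variances must be positive
    print(wasserstein_attention(mu, var).round(3))
```

Unlike a dot product, the (unsquared) distance here is a metric, so if item A is close to B and B is close to C, the triangle inequality bounds how far A can be from C. That bound is the property the abstract associates with collaborative transitivity.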