Collaborative filtering (CF) is widely used to learn informative latent representations of users and items from observed interactions. Existing CF-based methods commonly adopt negative sampling to discriminate between items. Training with negative sampling on large datasets is computationally expensive. Further, negative items must be sampled carefully under the defined distribution to avoid selecting an item that is an observed positive in the training dataset. Unavoidably, some negative items sampled from the training dataset may be positive in the test set. In this paper, we propose a self-supervised collaborative filtering framework (SelfCF) that is specifically designed for recommendation scenarios with implicit feedback. The proposed SelfCF framework simplifies Siamese networks and can be easily applied to existing deep-learning-based CF models, which we refer to as backbone networks. The main idea of SelfCF is to augment the output embeddings generated by the backbone networks, because it is infeasible to augment the raw input of user/item IDs. We propose and study three output perturbation techniques that can be applied to different types of backbone networks, including both traditional CF models and graph-based models. The framework enables learning informative representations of users and items without negative samples and is agnostic to the encapsulated backbones. We conduct comprehensive experiments on four datasets to show that our framework can achieve even better recommendation accuracy than the encapsulated supervised counterpart, with 2$\times$--4$\times$ faster training. We also show that SelfCF improves accuracy by up to 17.79\% on average compared with the self-supervised framework BUIR.
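To make the central idea concrete, the following is a minimal sketch of a loss computed without negative samples, in which the perturbation is applied to output embeddings rather than to raw IDs. It is an illustration under stated assumptions, not the SelfCF implementation: the abstract does not name the three perturbation techniques, so dropout on the embedding is assumed here as one plausible choice, and the function names, the linear predictor, and the symmetric cosine loss are all hypothetical. In a real framework the embeddings would come from a trainable backbone and gradients would be blocked on the perturbed target views.

```python
import math
import random


def embedding_dropout(vec, p, rng):
    """Assumed perturbation: zero each coordinate with probability p,
    rescaling the surviving coordinates by 1 / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    keep = 1.0 - p
    return [x / keep if rng.random() >= p else 0.0 for x in vec]


def cosine(a, b, eps=1e-12):
    """Cosine similarity; eps guards against a zero-norm vector."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / max(norm_a * norm_b, eps)


def linear_predictor(vec, weight):
    """Hypothetical predictor head: a plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weight]


def negative_free_loss(user_emb, item_emb, weight, p=0.1, rng=None):
    """Symmetric loss for one observed (user, item) pair.

    Only the positive pair is used; no negative item is sampled.
    """
    rng = rng or random.Random(0)
    # Perturbed "target" views of the backbone's output embeddings.
    user_target = embedding_dropout(user_emb, p, rng)
    item_target = embedding_dropout(item_emb, p, rng)
    # "Online" views pass through the predictor head.
    user_online = linear_predictor(user_emb, weight)
    item_online = linear_predictor(item_emb, weight)
    # Each side predicts the other side's perturbed view.
    loss_ui = 1.0 - cosine(user_online, item_target)
    loss_iu = 1.0 - cosine(item_online, user_target)
    return 0.5 * (loss_ui + loss_iu)
```

The loss lies in [0, 2] and approaches 0 when the predicted user view aligns with the item's perturbed view and vice versa. Because nothing in it depends on how the embeddings were produced, the same loss can wrap different backbones, which is the sense in which such a framework is backbone-agnostic.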