Contrastive learning (CL) has recently been shown to be critical for improving recommendation performance. The underlying principle of CL-based recommendation models is to ensure consistency between representations derived from different graph augmentations of the user-item bipartite graph. This self-supervised approach allows for the extraction of general features from raw data, thereby mitigating the issue of data sparsity. Despite the effectiveness of this paradigm, the factors contributing to its performance gains have yet to be fully understood. This paper provides novel insights into the impact of CL on recommendation. Our findings indicate that CL enables the model to learn more evenly distributed user and item representations, which alleviates the prevalent popularity bias and promotes long-tail items. Our analysis also suggests that graph augmentations, previously considered essential, are relatively unreliable and of limited significance in CL-based recommendation. Based on these findings, we propose an eXtremely Simple Graph Contrastive Learning method (XSimGCL) for recommendation, which discards the ineffective graph augmentations and instead employs a simple yet effective noise-based embedding augmentation to generate views for CL. A comprehensive experimental study on four large and highly sparse benchmark datasets demonstrates that, although the proposed method is extremely simple, it can smoothly adjust the uniformity of the learned representations and outperforms its graph-augmentation-based counterparts by a large margin in both recommendation accuracy and training efficiency. The code and the datasets used are released at https://github.com/Coder-Yu/SELFRec.
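To make the core idea concrete, the following is a minimal NumPy sketch of noise-based embedding augmentation and an InfoNCE-style contrastive objective. It is an illustration of the general technique described above, not the authors' released implementation; the specific perturbation form (sign-aligned uniform noise scaled to L2 norm `eps`) and the hyperparameter names `eps` and `tau` are assumptions for this sketch.

```python
import numpy as np

def perturb(emb, eps=0.1, rng=None):
    """Noise-based embedding augmentation: add a small random perturbation,
    aligned with each embedding's sign and scaled to L2 norm eps per row.
    (A sketch of the idea; details may differ from the paper's code.)"""
    rng = rng or np.random.default_rng()
    noise = rng.uniform(0.0, 1.0, emb.shape)
    noise /= np.linalg.norm(noise, axis=1, keepdims=True)  # unit L2 norm per row
    return emb + np.sign(emb) * noise * eps

def info_nce(view1, view2, tau=0.2):
    """InfoNCE contrastive loss between two views: matching rows are
    positives, all other rows in the batch serve as negatives."""
    v1 = view1 / np.linalg.norm(view1, axis=1, keepdims=True)
    v2 = view2 / np.linalg.norm(view2, axis=1, keepdims=True)
    logits = v1 @ v2.T / tau            # pairwise cosine similarities / temperature
    pos = np.diag(logits)               # similarity of each row to its own augmented view
    return np.mean(np.log(np.sum(np.exp(logits), axis=1)) - pos)
```

Because the perturbation magnitude `eps` directly controls how far the two views are pushed apart on the unit hypersphere, tuning it smoothly adjusts the uniformity of the learned representations, which is the knob the abstract refers to.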