Collaborative filtering (CF) is a widely studied research topic in recommender systems. Learning a CF model generally depends on three major components, namely the interaction encoder, the loss function, and negative sampling. While many existing studies focus on designing more powerful interaction encoders, the impact of loss functions and negative sampling ratios has not yet been well explored. In this work, we show that the choice of loss function and negative sampling ratio is equally important. More specifically, we propose the cosine contrastive loss (CCL) and further incorporate it into a simple unified CF model, dubbed SimpleX. We conduct extensive experiments on 11 benchmark datasets and compare SimpleX with a total of 29 existing CF models. Surprisingly, the results show that, with our CCL loss and a large negative sampling ratio, SimpleX surpasses most sophisticated state-of-the-art models by a large margin (e.g., up to 48.5% improvement in NDCG@20 over LightGCN). We believe that SimpleX can not only serve as a simple strong baseline to foster future research on CF, but also shed light on a promising research direction: improving loss functions and negative sampling.
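To make the loss concrete, the sketch below shows one plausible PyTorch formulation of a cosine contrastive loss over a batch of users, each paired with one positive item and K sampled negative items. The hyper-parameter names (`margin`, `neg_weight`) and the exact averaging over negatives are illustrative assumptions for this sketch, not a definitive reproduction of the paper's formulation.

```python
import torch
import torch.nn.functional as F


def cosine_contrastive_loss(user_emb, pos_item_emb, neg_item_emb,
                            margin=0.8, neg_weight=1.0):
    """Sketch of a cosine contrastive loss (CCL) for one batch.

    user_emb:      (B, d)    user embeddings
    pos_item_emb:  (B, d)    positive item embedding per user
    neg_item_emb:  (B, K, d) K sampled negative item embeddings per user

    `margin` and `neg_weight` are assumed hyper-parameters: negatives whose
    cosine similarity stays below `margin` contribute no loss, and the
    averaged negative term is scaled by `neg_weight`.
    """
    u = F.normalize(user_emb, dim=-1)
    pos = F.normalize(pos_item_emb, dim=-1)
    neg = F.normalize(neg_item_emb, dim=-1)

    # Positive part: push cos(u, i+) towards 1.
    pos_loss = 1.0 - (u * pos).sum(dim=-1)                    # (B,)

    # Negative part: penalize a negative only when its cosine similarity
    # to the user exceeds the margin.
    neg_cos = torch.bmm(neg, u.unsqueeze(-1)).squeeze(-1)     # (B, K)
    neg_loss = torch.relu(neg_cos - margin).mean(dim=-1)      # (B,)

    return (pos_loss + neg_weight * neg_loss).mean()
```

In this form, a large negative sampling ratio simply corresponds to a large K in `neg_item_emb`, and the margin filters out easy negatives so that only hard ones contribute gradient.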