Self-Supervised Learning for Recommender Systems: A Survey

Neural architecture-based recommender systems have achieved tremendous success in recent years. However, they still fall short of expectations when dealing with highly sparse data. Self-supervised learning (SSL), an emerging technique for learning from unlabeled data, has recently drawn considerable attention in many fields. A growing body of research also applies SSL to recommendation in order to mitigate the data sparsity issue. In this survey, we present a timely and systematic review of the research efforts on self-supervised recommendation (SSR). Specifically, we propose an exclusive definition of SSR, on top of which we build a comprehensive taxonomy that divides existing SSR methods into four categories: contrastive, generative, predictive, and hybrid. For each category, we describe its concept and formulation, the methods involved, and its pros and cons. Meanwhile, to facilitate the development and evaluation of SSR models, we release an open-source library, SELFRec, which incorporates multiple benchmark datasets and evaluation metrics and implements a number of state-of-the-art SSR models for empirical comparison. Finally, we shed light on the limitations of current research and outline future research directions.
