Self-supervised learning (SSL), which can automatically generate ground-truth samples from raw data, holds vast potential for improving recommender systems. Most existing SSL-based methods perturb the raw data graph with uniform node/edge dropout to generate new data views, and then conduct self-discrimination-based contrastive learning over the different views to learn generalizable representations. Under this scheme, only a bijective mapping is built between nodes in two different views, which means that the self-supervision signals from other nodes are neglected. Given the widely observed homophily in recommender systems, we argue that supervisory signals from other nodes are also highly likely to benefit representation learning for recommendation. To capture these signals, this paper proposes a general socially-aware SSL framework that integrates tri-training. Technically, our framework first augments the user data views with user social information. Then, under the regime of tri-training for multi-view encoding, the framework builds three graph encoders (one of them for recommendation) upon the augmented views and iteratively improves each encoder with self-supervision signals from other users, generated by the other two encoders. Since the tri-training operates on augmented views of the same data sources to obtain self-supervision signals, we name it self-supervised tri-training. Extensive experiments on multiple real-world datasets consistently validate the effectiveness of the self-supervised tri-training framework for improving recommendation. The code is released at https://github.com/Coder-Yu/QRec.
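The idea that two encoders supply self-supervision signals for the third can be illustrated with a small sketch. This is not the released QRec implementation: the function names (`pick_positives`, `neighbor_contrastive_loss`), the cosine-similarity scoring, the averaging of the two encoders' predictions, the top-k selection, and the temperature value are all assumptions made for illustration, and random matrices stand in for the outputs of the three graph encoders.

```python
import numpy as np

def l2_normalize(m):
    # Row-wise normalization so that dot products become cosine similarities.
    return m / np.linalg.norm(m, axis=1, keepdims=True)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def pick_positives(view_a, view_b, user, k):
    """Return the k users that two encoders jointly rate as most similar to `user`.

    Assumption: each encoder scores every user by cosine similarity to `user`
    within its own view, and the two softmax distributions are averaged.
    """
    a, b = l2_normalize(view_a), l2_normalize(view_b)
    score = (softmax(a @ a[user]) + softmax(b @ b[user])) / 2.0
    score[user] = -np.inf  # a user is never its own positive sample
    return np.argsort(-score)[:k]

def neighbor_contrastive_loss(target_view, user, positives, temperature=0.1):
    """InfoNCE-style loss on the remaining encoder's view.

    The selected positives are pulled toward `user`; every other user
    (except `user` itself) contributes to the denominator.
    """
    z = l2_normalize(target_view)
    sims = np.exp(z @ z[user] / temperature)
    others = np.ones(len(sims), dtype=bool)
    others[user] = False
    return -np.log(sims[positives].sum() / sims[others].sum())

# Stand-ins for user embeddings from three encoders (hypothetical sizes).
rng = np.random.default_rng(0)
views = [rng.normal(size=(8, 4)) for _ in range(3)]

# Each encoder in turn is improved with positives chosen by the other two.
for t in range(3):
    helper_a, helper_b = (views[i] for i in range(3) if i != t)
    pos = pick_positives(helper_a, helper_b, user=0, k=3)
    loss = neighbor_contrastive_loss(views[t], user=0, positives=pos)
    print(f"encoder {t}: positives={pos.tolist()} loss={loss:.4f}")
```

In a real training loop the loss would be backpropagated through the target encoder. Unlike self-discrimination, the positives here are other users rather than the same user in a second view, which is the distinction the abstract draws.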