Click-through rate (CTR) prediction is essential for modern recommender systems. From early factorization machines to recent deep learning based models, existing CTR methods focus on capturing useful feature interactions or mining important behavior patterns. Despite their effectiveness, we argue that these methods suffer from the risk of label sparsity (i.e., the user-item interactions are highly sparse with respect to the feature space), label noise (i.e., the collected user-item interactions are usually noisy), and the underuse of domain knowledge (i.e., the pairwise correlations between samples). To address these challenging problems, we propose a novel Multi-Interest Self-Supervised learning (MISS) framework which enhances the feature embeddings with interest-level self-supervision signals. With the help of two novel CNN-based multi-interest extractors, self-supervision signals are discovered with full consideration of different interest representations (point-wise and union-wise), interest dependencies (short-range and long-range), and interest correlations (inter-item and intra-item). Based on that, contrastive learning losses are further applied to the augmented views of interest representations, which effectively improves feature representation learning. Furthermore, the proposed MISS framework can be used as a plug-in component with existing CTR prediction models to further boost their performance. Extensive experiments on three large-scale datasets show that MISS significantly outperforms the state-of-the-art models, by up to 13.55% in AUC, and is also compatible with representative deep CTR models.
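To make the contrastive step more concrete, the following is a minimal sketch of an InfoNCE-style loss over two augmented views of a batch of representations. The abstract does not specify the exact loss used in MISS, so this is an illustrative assumption rather than the authors' formulation; the function names (`l2_normalize`, `info_nce_loss`), the cosine similarity, the in-batch negatives, and the `temperature` value are all hypothetical choices.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (a zero vector is left unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def info_nce_loss(view_a, view_b, temperature=0.1):
    """Generic InfoNCE loss (an assumption, not necessarily the MISS loss).

    view_a[i] and view_b[i] are two augmented views of the same sample and
    form the positive pair; view_b[j] with j != i act as in-batch negatives.
    """
    a = [l2_normalize(v) for v in view_a]
    b = [l2_normalize(v) for v in view_b]
    total = 0.0
    for i, anchor in enumerate(a):
        # Cosine similarity to every candidate, scaled by the temperature.
        logits = [dot(anchor, other) / temperature for other in b]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(z - m) for z in logits))
        # Negative log-probability of picking the positive candidate.
        total += log_denominator - logits[i]
    return total / len(a)


# Toy interest representations: three mutually orthogonal vectors.
views = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
shifted = views[1:] + views[:1]  # every positive pair is deliberately wrong

aligned_loss = info_nce_loss(views, views)
misaligned_loss = info_nce_loss(views, shifted)
print(aligned_loss < misaligned_loss)
```

In this toy setup the loss is close to zero when each view is paired with its true counterpart and large when the pairs are shuffled, which is the behavior a contrastive objective relies on to pull matching views together and push the rest apart.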