Recent years have witnessed the great success of self-supervised learning (SSL) in recommendation systems. However, SSL recommender models are likely to suffer from spurious correlations, leading to poor generalization. To mitigate spurious correlations, existing work usually pursues ID-based SSL recommendation or utilizes feature engineering to identify spurious features. Nevertheless, ID-based SSL approaches sacrifice the positive impact of invariant features, while feature engineering methods require high-cost human labeling. To address these problems, we aim to automatically mitigate the effect of spurious correlations. This objective requires us to 1) automatically mask spurious features without supervision, and 2) block the negative effect transmission from spurious features to other features during SSL. To handle these two challenges, we propose an invariant feature learning framework, which first divides user-item interactions into multiple environments with distribution shifts and then learns a feature mask mechanism to capture invariant features across environments. Based on the mask mechanism, we can remove the spurious features for robust predictions and block the negative effect transmission via mask-guided feature augmentation. Extensive experiments on two datasets demonstrate the effectiveness of the proposed framework in mitigating spurious correlations and improving the generalization ability of SSL models.
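To make the core idea concrete, the following is a minimal, hypothetical sketch (not the authors' actual implementation) of a learnable feature mask and mask-guided augmentation: a per-field mask down-weights likely-spurious feature fields for prediction, and the same mask guides stochastic feature dropping to build SSL views. The class and variable names (`MaskedFeatureEncoder`, `mask_logits`, etc.) are illustrative assumptions.

```python
# Hypothetical sketch of a feature mask mechanism with mask-guided augmentation.
import torch
import torch.nn as nn


class MaskedFeatureEncoder(nn.Module):
    def __init__(self, num_fields: int, embed_dim: int):
        super().__init__()
        # One learnable logit per feature field; sigmoid yields a soft mask in (0, 1).
        self.mask_logits = nn.Parameter(torch.zeros(num_fields))

    def forward(self, field_embeds: torch.Tensor) -> torch.Tensor:
        # field_embeds: (batch, num_fields, embed_dim)
        mask = torch.sigmoid(self.mask_logits)      # soft invariance scores per field
        return field_embeds * mask.view(1, -1, 1)   # down-weight likely-spurious fields

    def augment(self, field_embeds: torch.Tensor) -> torch.Tensor:
        # Mask-guided augmentation: fields with low mask values (likely spurious)
        # are dropped more often, blocking their effect transmission during SSL.
        mask = torch.sigmoid(self.mask_logits)
        keep = torch.bernoulli(mask.expand(field_embeds.size(0), -1))
        return field_embeds * keep.unsqueeze(-1)


# Usage: embeddings for 8 feature fields of a batch of user-item interactions.
encoder = MaskedFeatureEncoder(num_fields=8, embed_dim=16)
x = torch.randn(4, 8, 16)
invariant_view = encoder(x)          # masked features for robust prediction
augmented_view = encoder.augment(x)  # stochastic view for contrastive SSL
```

In the full framework described above, the mask would additionally be trained to be invariant across the constructed environments; that objective is omitted here for brevity.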