Recommender systems usually build personalized recommendation models from observed user interaction data, assuming that the observed data reflect user interest. However, a user may also interact with an item out of conformity, the tendency to follow popular items. Most previous studies neglect user conformity and entangle it with interest, which may cause recommender systems to fail to provide satisfying results. Therefore, from a cause-effect view, disentangling these two causes of interaction is a crucial issue. Doing so also helps with out-of-distribution (OOD) problems, where the training and test data follow different distributions. Nevertheless, disentanglement is quite challenging because we lack a signal that differentiates interest from conformity. The sparsity of data attributable to a single pure cause and the long-tail problem of items further hinder learning disentangled causal embeddings. In this paper, we propose DCCL, a framework that adopts contrastive learning to disentangle these two causes through sample augmentation for interest and conformity respectively. Furthermore, DCCL is model-agnostic and can be easily deployed in any industrial online system. Extensive experiments on two real-world datasets show that DCCL outperforms state-of-the-art baselines on top of various backbone models in various OOD environments. We also demonstrate performance improvements through online A/B testing on Kuaishou, a billion-user scale short-video recommender system.