Recommender systems usually face popularity bias issues: from the data perspective, items exhibit an uneven (long-tail) distribution in interaction frequency; from the method perspective, collaborative filtering methods are prone to amplifying the bias by over-recommending popular items. It is undoubtedly critical to consider popularity bias in recommender systems, and existing work mainly aims to eliminate the bias effect. However, we argue that not all biases in the data are bad -- some items exhibit higher popularity because of their better intrinsic quality. Blindly pursuing unbiased learning may remove these beneficial patterns from the data, degrading recommendation accuracy and user satisfaction. This work studies an unexplored problem in recommendation -- how to leverage popularity bias to improve recommendation accuracy. The key lies in two aspects: how to remove the bad impact of popularity bias during training, and how to inject the desired popularity bias at the inference stage that generates top-K recommendations. This requires examining the causal mechanism of the recommendation generation process. Along this line, we find that item popularity plays the role of a confounder between the exposed items and the observed interactions, causing the bad effect of bias amplification. To achieve our goal, we propose a new training and inference paradigm for recommendation named Popularity-bias Deconfounding and Adjusting (PDA). It removes the confounding popularity bias in model training and adjusts the recommendation score with the desired popularity bias via causal intervention. We demonstrate the new paradigm on the latent factor model and perform extensive experiments on three real-world datasets. Empirical studies validate that the deconfounded training is helpful for discovering users' real interests, and that the inference adjustment with popularity bias can further improve recommendation accuracy.
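The two-stage idea -- a deconfounded interest score learned in training, multiplicatively adjusted by a desired popularity signal at inference -- can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the dot-product interest model, and the single exponent `gamma` controlling the strength of the injected popularity are all assumptions for illustration.

```python
import numpy as np

def popularity_adjusted_scores(user_emb, item_emb, item_pop, gamma=0.1):
    """Sketch of PDA-style inference adjustment (illustrative, not the paper's code).

    user_emb : (n_users, d) array of user latent factors
    item_emb : (n_items, d) array of item latent factors
    item_pop : (n_items,) array of desired item popularity (e.g. predicted
               future popularity); gamma=0 recovers the purely
               deconfounded interest score.
    """
    # Deconfounded interest score from the latent factor model,
    # clipped to be non-negative so the popularity term only rescales it.
    base = np.maximum(user_emb @ item_emb.T, 0.0)   # (n_users, n_items)
    # Inject the desired popularity bias multiplicatively.
    return base * item_pop[None, :] ** gamma

# Top-K recommendation then ranks each user's row of the adjusted scores:
# np.argsort(-scores, axis=1)[:, :K]
```

With `gamma = 0` the ranking reflects only the deconfounded user interest; increasing `gamma` progressively boosts items whose desired popularity is higher, which matches the abstract's claim that injecting the right amount of popularity bias at inference can improve accuracy.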