Most recommender systems optimize the model on observed interaction data, which is shaped by the previous exposure mechanism and exhibits many biases, such as popularity bias. The loss functions, such as the most widely used pointwise Binary Cross-Entropy and pairwise Bayesian Personalized Ranking, are not designed to account for the biases in observed data. As a result, a model optimized on such a loss inherits the data biases or, even worse, amplifies them. For example, a few popular items take up more and more exposure opportunities, severely hurting the recommendation quality on niche items, a phenomenon known as the Matthew effect. In this work, we develop a new learning paradigm named Cross Pairwise Ranking (CPR) that achieves unbiased recommendation without knowing the exposure mechanism. Distinct from inverse propensity scoring (IPS), we change the loss term of a sample: we sample multiple observed interactions at once and form the loss as a combination of their predictions. We prove theoretically that this offsets the influence of user/item propensity on learning, removing the influence of data biases caused by the exposure mechanism. In contrast to IPS, CPR ensures unbiased learning for each training instance without the need to set propensity scores. Experimental results demonstrate the superiority of CPR over state-of-the-art debiasing solutions in both model generalization and training efficiency. The code is available at https://github.com/Qcactus/CPR.
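The abstract does not spell out the loss, so the following is only a minimal sketch of one way to combine the predictions of several observed interactions, assuming the two-interaction case. Given observed pairs (u1, i1) and (u2, i2), it contrasts their scores with the crossed pairs (u1, i2) and (u2, i1). The function name `cross_pairwise_loss`, the dense `scores` matrix, and the 0.5 scaling are illustrative assumptions rather than the authors' implementation, and the sketch does not check that the crossed pairs are unobserved.

```python
import numpy as np


def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return np.logaddexp(0.0, x)


def cross_pairwise_loss(scores, users, items):
    """Hypothetical two-interaction cross pairwise loss.

    scores: (n_users, n_items) array of predicted scores.
    users, items: integer arrays of shape (batch, 2); row k holds two
        observed interactions (users[k, 0], items[k, 0]) and
        (users[k, 1], items[k, 1]).
    """
    u1, u2 = users[:, 0], users[:, 1]
    i1, i2 = items[:, 0], items[:, 1]
    observed = scores[u1, i1] + scores[u2, i2]
    crossed = scores[u1, i2] + scores[u2, i1]
    margin = 0.5 * (observed - crossed)  # 0.5 scaling is an assumption
    # -log(sigmoid(margin)) equals softplus(-margin).
    return softplus(-margin).mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    scores = rng.normal(size=(4, 5))
    users = np.array([[0, 1], [2, 3]])
    items = np.array([[0, 1], [2, 4]])

    # Add an arbitrary per-user and per-item offset to every score.
    user_offset = rng.normal(size=(4, 1))
    item_offset = rng.normal(size=(1, 5))
    shifted = scores + user_offset + item_offset

    base = cross_pairwise_loss(scores, users, items)
    moved = cross_pairwise_loss(shifted, users, items)
    print(np.isclose(base, moved))
```

In this sketch, any additive per-user or per-item term appears once in the observed sum and once in the crossed sum, so it cancels in the margin. This is intended to illustrate the kind of offsetting of user/item propensity that the abstract describes, not to reproduce the paper's proof.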