Recommender systems now affect almost every facet of people's lives. To provide personalized, high-quality recommendations, conventional systems usually train pointwise rankers to predict the absolute value of each objective and use a separate shallow tower to estimate and alleviate the impact of position bias. However, under this training paradigm, the optimization target differs substantially from ranking metrics, which value the relative order of the top-ranked items rather than the prediction precision for each individual item. Moreover, because the existing system tends to place more relevant items at higher positions, it is difficult for shallow-tower-based methods to attribute user feedback precisely to either position or relevance. Resolving these two issues therefore offers an opportunity to improve performance. Unbiased learning-to-rank (LTR) algorithms, which have been verified to model relative relevance accurately from noisy feedback, are appealing candidates and have already been applied in many applications with a single categorical label, such as user click signals. Nevertheless, existing unbiased LTR methods cannot properly handle multiple types of feedback that combine categorical and continuous labels. Accordingly, we design a novel unbiased LTR algorithm to tackle these challenges. It models position bias in a pairwise fashion and introduces a pairwise trust bias to separate position bias, trust bias, and user relevance explicitly. Experimental results on public benchmark datasets and internal live traffic show that the proposed method achieves superior results for both categorical and continuous labels.
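
To make the idea of pairwise debiasing concrete, the following is a minimal sketch of a generic inverse-propensity-weighted pairwise logistic loss. It is not the algorithm proposed here: the function name `pairwise_ipw_loss`, the per-position propensity ratios `t_plus` and `t_minus`, and the toy inputs are all assumptions introduced for illustration, and the trust-bias term is not modeled.

```python
import math


def pairwise_ipw_loss(scores, labels, positions, t_plus, t_minus):
    """Inverse-propensity-weighted pairwise logistic loss for one ranked list.

    scores    : model scores, one per item
    labels    : observed feedback per item (categorical or continuous)
    positions : 0-based display position of each item
    t_plus    : ASSUMED propensity ratio per position for the preferred item
    t_minus   : ASSUMED propensity ratio per position for the other item
    """
    total, n_pairs = 0.0, 0
    for s_i, y_i, p_i in zip(scores, labels, positions):
        for s_j, y_j, p_j in zip(scores, labels, positions):
            if y_i <= y_j:
                continue  # keep only pairs where item i has stronger feedback
            # Pairs observed at rarely examined positions get a larger weight.
            weight = 1.0 / (t_plus[p_i] * t_minus[p_j])
            # Numerically stable log(1 + exp(-(s_i - s_j))).
            x = -(s_i - s_j)
            total += weight * (max(x, 0.0) + math.log1p(math.exp(-abs(x))))
            n_pairs += 1
    return total / n_pairs if n_pairs else 0.0


# Toy usage: the clicked item sits at position 1, whose assumed ratio is 0.5,
# so its pair is weighted twice as heavily as an unweighted pair.
loss = pairwise_ipw_loss(
    scores=[0.0, 0.0],
    labels=[0, 1],
    positions=[0, 1],
    t_plus=[1.0, 0.5],
    t_minus=[1.0, 1.0],
)
```

Because the loss depends only on score differences within a pair, it targets relative order directly, which is the contrast with pointwise training drawn above. In practice the propensity ratios would have to be estimated rather than supplied by hand.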