Mixture-Based Correction for Position and Trust Bias in Counterfactual Learning to Rank

In counterfactual learning to rank (CLTR), user interactions are used as a source of supervision. Since user interactions are biased, an important focus of research in this field is developing methods to correct for that bias. Inverse propensity scoring (IPS) is a popular method for correcting position bias. Affine correction (AC) is a generalization of IPS that corrects for both position bias and trust bias. IPS and AC provably remove bias, provided that the bias parameters are estimated accurately. Estimating the bias parameters, in turn, requires accurate estimates of the relevance probabilities. This cyclic dependency introduces practical limitations in terms of sensitivity, convergence, and efficiency. We propose a new correction method for position and trust bias in CLTR in which, unlike existing methods, the correction does not rely on relevance estimation. Our proposed method, mixture-based correction (MBC), is based on the assumption that the distribution of click-through rates (CTRs) over the items being ranked is a mixture of two distributions: the distribution of CTRs for relevant items and the distribution of CTRs for non-relevant items. We prove that our method is unbiased, and the validity of the proof is not conditioned on accurate bias parameter estimation. Our experiments show that MBC, when used in different bias settings and combined with different LTR algorithms, outperforms AC, the state-of-the-art method for correcting position and trust bias, in some settings and performs on par with it in others. Furthermore, MBC is orders of magnitude more efficient than AC in terms of training time.
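
The mixture assumption can be illustrated with a minimal sketch. This is not the implementation from the paper: the Gaussian component family, the EM fitting procedure, the function names `_responsibilities` and `fit_two_component_mixture`, and the synthetic click probabilities (0.45 for relevant items, 0.10 for non-relevant items) are all assumptions made for illustration. The sketch fits a two-component mixture to the CTRs observed at a single rank position and reads the posterior of the higher-mean component as a relevance probability, without estimating any bias parameters.

```python
import numpy as np


def _responsibilities(x, weights, means, stds):
    """Return per-item component posteriors and the total log-likelihood."""
    z = (x[:, None] - means[None, :]) / stds[None, :]
    log_pdf = -0.5 * z**2 - np.log(stds[None, :]) - 0.5 * np.log(2 * np.pi)
    log_joint = log_pdf + np.log(weights[None, :])
    log_norm = np.logaddexp(log_joint[:, 0], log_joint[:, 1])
    return np.exp(log_joint - log_norm[:, None]), log_norm.sum()


def fit_two_component_mixture(ctrs, n_iter=200, tol=1e-8):
    """Fit a 1-D two-component Gaussian mixture with EM (illustrative only).

    Returns (weights, means, stds, p_high), where p_high[i] is the posterior
    probability that ctrs[i] belongs to the higher-mean component.
    """
    x = np.asarray(ctrs, dtype=float)
    # Assumption: start the two means at the extremes of the observed CTRs.
    means = np.array([x.min(), x.max()])
    stds = np.full(2, x.std() + 1e-6)
    weights = np.array([0.5, 0.5])
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: posterior of each component for each CTR.
        resp, ll = _responsibilities(x, weights, means, stds)
        # M-step: re-estimate weights, means and standard deviations.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / nk.sum()
        means = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - means[None, :]) ** 2).sum(axis=0) / nk
        stds = np.sqrt(var) + 1e-6
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    # Recompute posteriors under the final parameters.
    resp, _ = _responsibilities(x, weights, means, stds)
    return weights, means, stds, resp[:, int(np.argmax(means))]


# Synthetic CTRs for 300 items shown at one position (hypothetical numbers).
rng = np.random.default_rng(0)
n_impressions = 2000
is_relevant = rng.random(300) < 0.3
p_click = np.where(is_relevant, 0.45, 0.10)
ctrs = rng.binomial(n_impressions, p_click) / n_impressions

weights, means, stds, p_rel = fit_two_component_mixture(ctrs)
print("component means:", np.sort(means))
print("items classified as relevant:", int((p_rel > 0.5).sum()))
```

In this toy setting the two CTR clusters are well separated, so the posterior `p_rel` recovers the hidden relevance labels. With real click logs the components overlap and the choice of component family matters, which this sketch does not address.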
