Online Evaluation Methods for the Causal Effect of Recommendations

Evaluating the causal effect of recommendations is an important objective because the causal effect on user interactions can directly lead to increases in sales and user engagement. To select the best recommendation model, it is common to conduct A/B testing to compare model performance. However, A/B testing of causal effects requires a large number of users, which makes such experiments costly and risky. We therefore propose the first interleaving methods that can efficiently compare recommendation models in terms of causal effects. In contrast to conventional interleaving methods, we measure the outcomes of both items on the interleaved list and items not on it, since the causal effect is the difference between outcomes with and without recommendations. To ensure that the evaluations are unbiased, we either select items with equal probability or weight the outcomes using inverse propensity scores. We then verify the unbiasedness and efficiency of the proposed online evaluation methods through simulated online experiments. The results indicate that our proposed methods are unbiased and more efficient than A/B testing.
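
To make the inverse propensity weighting idea concrete, the following is a minimal sketch of a generic inverse-propensity-score (IPS) estimator for the difference between outcomes with and without a recommendation. It is not the interleaving procedure proposed in the paper, which the abstract does not specify. The function names `ips_causal_effect` and `expected_ips_term`, the argument layout, and the assumption that each observation comes with a known recommendation probability are all illustrative choices.

```python
from typing import Sequence


def ips_causal_effect(
    outcomes: Sequence[float],
    recommended: Sequence[bool],
    propensities: Sequence[float],
) -> float:
    """IPS estimate of the average effect (outcome with minus without recommendation).

    Hypothetical helper for illustration: propensities[i] is assumed to be the
    known probability that observation i was recommended.
    """
    if not (len(outcomes) == len(recommended) == len(propensities)):
        raise ValueError("all inputs must have the same length")
    if not outcomes:
        raise ValueError("need at least one observation")
    total = 0.0
    for y, z, p in zip(outcomes, recommended, propensities):
        if not 0.0 < p < 1.0:
            raise ValueError("propensities must lie strictly between 0 and 1")
        # Recommended observations add y / p; the others subtract y / (1 - p).
        total += y / p if z else -y / (1.0 - p)
    return total / len(outcomes)


def expected_ips_term(y_with: float, y_without: float, p: float) -> float:
    """Exact expectation of one IPS term over the random recommendation decision."""
    # With probability p the term is y_with / p; otherwise it is -y_without / (1 - p).
    return p * (y_with / p) - (1.0 - p) * (y_without / (1.0 - p))
```

For instance, `ips_causal_effect([1, 1, 0, 0], [True, True, False, False], [0.5] * 4)` returns `1.0`. The second helper shows why the weighting removes bias under these assumptions: the expectation of each term reduces to `y_with - y_without` for any propensity strictly between 0 and 1.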
