Recommender systems have become an integral part of online platforms, providing personalized suggestions for purchasing items, consuming content, and connecting with individuals. An online recommender system consists of two sides: the producer side comprises product sellers, content creators, and service providers, while the consumer side includes buyers, viewers, and guests. To optimize an online recommender system, A/B tests serve as the gold standard for comparing different ranking models and evaluating their impact on both consumers and producers. While consumer-side experiments are relatively straightforward to design and are commonly used to gauge the impact of ranking changes on the behavior of consumers (buyers, viewers, etc.), designing producer-side experiments presents a considerable challenge because producer items in the treatment and control groups need to be ranked by different models and then merged into a single ranking for the recommender to show to each consumer. In this paper, we review issues with existing methods, propose new design principles for producer-side experiments, and develop a rigorous solution based on counterfactual interleaving designs for accurately measuring the effects of ranking changes on producers (sellers, creators, etc.).
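To make the merging challenge described above concrete, the following is a minimal illustrative sketch, not the paper's actual design: it assumes each producer item is assigned to treatment or control, places every item at the position it would receive under its own group's counterfactual ranking, and resolves position collisions with an arbitrary deterministic tie-breaking rule. The function names, rankers, and tie-breaking rule are all hypothetical.

```python
from typing import Callable, Dict, List


def counterfactual_interleave(
    items: List[str],
    group: Dict[str, str],                 # producer item -> "treatment" or "control"
    rank_treatment: Callable[[List[str]], List[str]],
    rank_control: Callable[[List[str]], List[str]],
) -> List[str]:
    """Merge treatment- and control-ranked items into one list for a consumer.

    Each item claims the position it would hold under the model corresponding
    to its producer's experiment group; collisions are broken deterministically.
    """
    # Position of every item under each model's full (counterfactual) ranking.
    pos_t = {item: i for i, item in enumerate(rank_treatment(items))}
    pos_c = {item: i for i, item in enumerate(rank_control(items))}

    # Each item claims the position given by its own group's model.
    claimed = {
        item: (pos_t[item] if group[item] == "treatment" else pos_c[item])
        for item in items
    }

    # Resolve conflicts: sort by claimed position, then by the other model's
    # position, then by item id (an assumed tie-breaking rule for illustration).
    def sort_key(item: str):
        other = pos_c[item] if group[item] == "treatment" else pos_t[item]
        return (claimed[item], other, item)

    return sorted(items, key=sort_key)


if __name__ == "__main__":
    items = ["a", "b", "c", "d"]
    group = {"a": "treatment", "b": "control", "c": "treatment", "d": "control"}
    # Hypothetical rankers: the treatment model reverses the control order.
    rank_control = lambda xs: sorted(xs)
    rank_treatment = lambda xs: sorted(xs, reverse=True)
    print(counterfactual_interleave(items, group, rank_treatment, rank_control))
```

In this sketch, treatment-group items are positioned as the treatment model would rank them and control-group items as the control model would, which is the property a counterfactual design aims to preserve; how ties and downstream biases are handled is exactly where the paper's proposed design principles come in.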