While popularity bias is recognized to play a crucial role in recommender (and other ranking-based) systems, detailed analysis of its impact on collective user welfare has largely been lacking. We propose and theoretically analyze a general mechanism, rooted in many of the models proposed in the literature, by which item popularity, item quality, and position bias jointly impact user choice. We focus on a standard setting in which user utility is largely driven by item quality, and a recommender attempts to estimate it from user behavior. Formulating the problem as a non-stationary contextual bandit, we study the ability of a recommender policy to maximize user welfare under this model. We highlight the importance of exploration, not to eliminate popularity bias, but to mitigate its negative impact on welfare. We first show that naive popularity-biased recommenders induce linear regret by conflating item quality and popularity. More generally, we show that, even in linear settings, item quality may not be identifiable due to the confounding effects of popularity bias. However, under sufficient variability assumptions, we develop an efficient optimistic algorithm and prove efficient regret guarantees w.r.t. user welfare. We complement our analysis with several simulation studies, which demonstrate the negative impact of popularity bias on the performance of several natural recommender policies.