Optimizing recommender systems based on user interaction data is mainly seen as a problem of dealing with selection bias, and most existing work assumes that interactions from different users are independent. However, it has been shown that in reality user feedback is often influenced by earlier interactions of other users, for example through average ratings or the number of views or sales per item. This phenomenon is known as the bandwagon effect. In contrast with previous literature, we argue that the bandwagon effect should not be seen as a problem of statistical bias. In fact, we prove that this effect leaves both individual interactions and their sample mean unbiased. Nevertheless, we show that it can make estimators inconsistent, which introduces a distinct set of convergence problems for relevance estimation. Our theoretical analysis investigates the conditions under which the bandwagon effect poses a consistency problem and explores several approaches for mitigating these issues. This work aims to show that the bandwagon effect is an under-investigated open problem that is fundamentally distinct from the well-studied selection bias in recommendation.
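The distinction between unbiasedness and consistency drawn in the abstract can be illustrated with a small simulation. The sketch below does not reproduce the paper's model: it assumes a Pólya-urn-style feedback process, in which each user gives positive feedback with a probability equal to the running average shown to them, smoothed by a prior centred on the true relevance. All names (`simulate_run`, `summarize`, `prior_weight`) are hypothetical and chosen only for this illustration.

```python
import random
import statistics


def simulate_run(n_users, relevance=0.5, prior_weight=2.0, bandwagon=True, rng=None):
    """Return the sample mean of binary feedback from n_users for one item.

    Assumed toy model (not taken from the paper): with bandwagon=True, user t
    responds positively with probability equal to the smoothed average of all
    earlier feedback, as in a Polya urn. With bandwagon=False, every user
    responds independently with probability `relevance`.
    """
    rng = rng or random.Random()
    positives = 0
    for t in range(n_users):
        if bandwagon:
            # Displayed average: prior pseudo-counts plus earlier feedback.
            p = (relevance * prior_weight + positives) / (prior_weight + t)
        else:
            p = relevance
        if rng.random() < p:
            positives += 1
    return positives / n_users


def summarize(n_users, n_runs, bandwagon, seed=0):
    """Mean and standard deviation of the sample mean across repeated runs."""
    rng = random.Random(seed)
    means = [simulate_run(n_users, bandwagon=bandwagon, rng=rng) for _ in range(n_runs)]
    return statistics.mean(means), statistics.pstdev(means)


if __name__ == "__main__":
    for n_users in (10, 100, 1000):
        for bandwagon in (False, True):
            avg, spread = summarize(n_users, n_runs=500, bandwagon=bandwagon)
            print(f"n={n_users:5d} bandwagon={bandwagon!s:5} mean={avg:.3f} std={spread:.3f}")
```

Under this assumed urn model, the average of the sample mean across runs stays close to the true relevance in both settings, which matches the unbiasedness claim. In the independent setting the spread across runs shrinks as the number of users grows, whereas in the bandwagon setting it does not vanish, because each run settles on its own limiting value. A single long interaction log therefore need not converge to the true relevance, which is the consistency problem the abstract describes.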