We explore the hidden feedback loop effect in online recommender systems. Feedback loops cause the recommendations of online multi-armed bandit (MAB) algorithms to degrade to a small subset of items, with a loss of coverage and novelty. We study how uncertainty and noise in user interests influence the existence of feedback loops. First, we show that unbiased additive random noise in user interests does not prevent a feedback loop. Second, we demonstrate that a non-zero probability of resetting user interests is sufficient to limit the feedback loop, and we estimate the size of the effect. Our experiments in a simulated environment confirm the theoretical findings for four bandit algorithms.
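
The two claims above can be illustrated with a toy simulation. The sketch below is not the experimental setup of the paper: the logistic click model, the rule that a click raises interest by `delta` and a skip lowers it, the use of Thompson sampling as the bandit, all parameter values, and the function name `mean_interest_drift` are assumptions made for illustration only.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def mean_interest_drift(steps=2000, n_items=10, delta=0.1,
                        noise_std=0.0, reset_prob=0.0, seed=0):
    """Time-averaged L1 distance of user interests from their initial value.

    Hypothetical toy model: a Thompson sampling recommender interacts with
    one user whose interest in an item shifts after every recommendation.
    """
    rng = random.Random(seed)
    interests = [0.0] * n_items  # assumed neutral initial interests
    wins = [1] * n_items         # Beta posterior parameter: 1 + clicks
    losses = [1] * n_items       # Beta posterior parameter: 1 + skips
    total_drift = 0.0
    for _ in range(steps):
        # Thompson sampling: sample a click rate per item, recommend the best.
        samples = [rng.betavariate(wins[i], losses[i]) for i in range(n_items)]
        item = max(range(n_items), key=samples.__getitem__)
        # The user responds to the interest plus zero-mean additive noise.
        noisy_interest = interests[item] + rng.gauss(0.0, noise_std)
        clicked = rng.random() < sigmoid(noisy_interest)
        # Assumed loop dynamics: a click reinforces interest, a skip erodes it.
        if clicked:
            interests[item] += delta
            wins[item] += 1
        else:
            interests[item] -= delta
            losses[item] += 1
        # With probability reset_prob the interests return to neutral.
        if rng.random() < reset_prob:
            interests = [0.0] * n_items
        total_drift += sum(abs(v) for v in interests)
    return total_drift / steps


if __name__ == "__main__":
    print("no noise, no reset:", mean_interest_drift())
    print("noise, no reset:   ", mean_interest_drift(noise_std=0.5))
    print("reset prob 0.05:   ", mean_interest_drift(reset_prob=0.05))
```

Under these assumptions, the drift stays large when only zero-mean noise is added, because the noise does not change the sign of the expected update. A reset probability caps the drift, since interests can move by at most `delta` per step between resets.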