This paper proposes a new approach to training recommender systems called deviation-based learning. The recommender and rational users have different knowledge. The recommender learns user knowledge by observing what action users take upon receiving recommendations. Learning eventually stalls if the recommender always suggests a choice: Before the recommender completes learning, users start following the recommendations blindly, and their choices do not reflect their knowledge. The learning rate and social welfare improve substantially if the recommender abstains from recommending a particular choice when she predicts that multiple alternatives will produce a similar payoff.
翻译:本文件提出了培训推荐者系统的新办法,称为偏差学习。推荐者和理性使用者有不同的知识。推荐者通过观察用户在接受建议时采取什么行动来学习用户的知识。如果推荐者总是建议一种选择,学习最终会拖延时间:在推荐者完成学习之前,用户开始盲目地遵循建议,他们的选择并不反映他们的知识。如果推荐者在预测多种替代方案会产生类似的回报时不建议特定选择,学习率和社会福利就会大为改善。