Personalized recommendation systems (RS) are extensively used in many services. Many of these systems are based on learning algorithms in which the RS uses the recommendation history and the user responses to learn an optimal strategy. Furthermore, these algorithms assume that user interests are static; in particular, they do not account for the effect of the learning strategy on the evolution of the user interests. In this paper we develop influence models for a learning algorithm that is used to optimally recommend websites to web users. We adapt the model of \cite{Ioannidis10} to include an item-dependent reward to the RS for the suggestions that are accepted by the user. We first develop a static optimization scheme for the case when all the parameters are known. Next, we develop a stochastic-approximation-based learning scheme that allows the RS to learn the optimal strategy when the user profiles are not known. Finally, we describe several user-influence models for the learning algorithm and analyze their effect on the steady-state user interests and on the steady-state optimal strategy, compared with the case when the users are not influenced.