Interactive recommendation, which models the explicit interactions between users and the recommender system, has attracted considerable research attention in recent years. Most previous interactive recommendation systems focus only on optimizing recommendation accuracy and overlook other important aspects of recommendation quality, such as the diversity of the recommendation results. In this paper, we propose a novel recommendation model, named \underline{D}iversity-promoting \underline{D}eep \underline{R}einforcement \underline{L}earning (D$^2$RL), which encourages diverse recommendation results in interactive recommendation. More specifically, we adopt a Determinantal Point Process (DPP) model to generate item recommendations that are diverse yet relevant. A personalized DPP kernel matrix is maintained for each user and is constructed from two parts: a fixed similarity matrix capturing item-item similarity, and the relevance of items, which is learnt dynamically through an actor-critic reinforcement learning framework. We performed extensive offline experiments as well as simulated online experiments with real-world datasets to demonstrate the effectiveness of the proposed model.
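To make the two-part kernel concrete, the following is a minimal sketch rather than the D$^2$RL implementation. It assumes the common quality-diversity form $L = \mathrm{Diag}(q)\, S\, \mathrm{Diag}(q)$, where $S$ is the fixed item-item similarity matrix and $q$ holds per-item relevance scores; the abstract does not state the exact construction, so this form is an assumption. The relevance vector here is a hand-written stand-in for the output of the actor-critic model, and `build_kernel` and `greedy_dpp` are hypothetical helper names. Selection uses a naive greedy approximation to DPP MAP inference.

```python
import numpy as np


def build_kernel(relevance, similarity):
    """Assumed form L = Diag(q) S Diag(q); not confirmed by the abstract."""
    q = np.asarray(relevance, dtype=float)
    S = np.asarray(similarity, dtype=float)
    # Element-wise: L[i, j] = q[i] * S[i, j] * q[j]
    return q[:, None] * S * q[None, :]


def greedy_dpp(L, k):
    """Naive greedy MAP approximation: at each step, add the item that
    gives the largest log-determinant of the selected submatrix."""
    selected = []
    remaining = list(range(L.shape[0]))
    for _ in range(k):
        best_item, best_logdet = None, -np.inf
        for i in remaining:
            idx = selected + [i]
            sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
            if sign > 0 and logdet > best_logdet:
                best_item, best_logdet = i, logdet
        if best_item is None:  # no candidate keeps the determinant positive
            break
        selected.append(best_item)
        remaining.remove(best_item)
    return selected


# Toy data (illustrative only): items 0 and 1 are near-duplicates.
similarity = np.array([
    [1.00, 0.99, 0.10],
    [0.99, 1.00, 0.10],
    [0.10, 0.10, 1.00],
])
# Stand-in for relevance scores that the learned policy would supply.
relevance = np.array([1.0, 0.9, 0.6])

L = build_kernel(relevance, similarity)
picked = greedy_dpp(L, k=2)
print(picked)
```

In this toy case, ranking by relevance alone would return the near-duplicate items 0 and 1, whereas the determinant penalizes their similarity and the greedy step picks item 2 as the second recommendation.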