Knowledge Graphs (KGs) have been integrated into several recommendation models to augment the informational value of an item by means of its related entities in the graph. Yet existing datasets provide explicit ratings only on items, with no information about user opinions of other (non-recommendable) entities. To overcome this limitation, we introduce a new dataset, called MindReader, that provides explicit user ratings for both items and KG entities. In this first version, the MindReader dataset provides more than 102 thousand explicit ratings collected from 1,174 real users on both items and entities from a KG in the movie domain. The dataset was collected through an online interview application that we also release as open source. To demonstrate the importance of this new dataset, we present a comparative study of the effect of including ratings on non-item KG entities in a variety of state-of-the-art recommendation models. In particular, we show that most models, whether designed specifically for graph data or not, see improvements in recommendation quality when trained on explicit non-item ratings. Moreover, for some models, we show that non-item ratings can effectively replace item ratings without loss of recommendation quality. This finding, together with the observation that users are more familiar with common KG entities than with long-tail items, motivates the use of KG entities for both warm-start and cold-start recommendations.