Many bipartite networks describe systems in which a link represents a relation between a user and an item. Measuring the similarity between users, or between items, is the basis of memory-based collaborative filtering, a widely used approach to building recommender systems that propose items to users. When the edges of the network are unweighted, traditional approaches allow only positive similarity values, thereby neglecting the possibility, and the effect, of two users (or two items) being very dissimilar. Here we propose the Sapling Similarity, a method to compute similarity that also allows negative values. The key idea is to look at how the information that a user is connected to an item influences our prior estimate of the probability that another user is connected to the same item: if the estimate is reduced, the similarity between the two users is negative; otherwise it is positive. Using different datasets, we show that the Sapling Similarity outperforms other similarity metrics when it is used to recommend new items to users.
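The sign rule described above can be illustrated with a small sketch. This is not the Sapling Similarity formula, which the abstract does not state; it only compares a prior estimate with an updated one. The function name `probability_shift`, the choice of prior (the degree of user j divided by the number of items), and the toy data are all assumptions made for illustration.

```python
def probability_shift(items_of, i, j, n_items):
    """Signed change in the estimated probability that user j is linked
    to an item once we learn that user i is linked to it.

    Illustrative assumption, not the paper's formula:
      prior   = k_j / N        (fraction of all items linked to j)
      updated = CO_ij / k_i    (fraction of i's items also linked to j)
    A negative result means the information lowered the estimate.
    """
    k_i = len(items_of[i])
    k_j = len(items_of[j])
    if k_i == 0:
        raise ValueError("user i has no links, so the update is undefined")
    co_occurrence = len(items_of[i] & items_of[j])
    prior = k_j / n_items
    updated = co_occurrence / k_i
    return updated - prior


# Hypothetical unweighted bipartite network: three users, four items.
items_of = {
    "ann": {"a", "b"},
    "bob": {"a", "b"},
    "cat": {"c", "d"},
}
same_taste = probability_shift(items_of, "ann", "bob", n_items=4)
opposite_taste = probability_shift(items_of, "ann", "cat", n_items=4)
```

In this toy network, `same_taste` comes out positive because the two users share all their items, while `opposite_taste` comes out negative because they share none. A co-occurrence count alone could not separate "dissimilar" from "unrelated" in this way.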