Collaborative filtering (CF) is an important approach in recommender systems and is widely used in many aspects of daily life, most heavily in online commercial systems. One popular CF algorithm is K-nearest neighbors (KNN), in which a similarity measure is used to determine the nearest neighbors of a user and thus to quantify the degree of dependency between the relevant user/item pair. Consequently, the CF approach is not merely sensitive to the similarity measure; it depends entirely on the choice of that measure. Jaccard, one of the similarity measures commonly used for CF tasks, considers only the existence of ratings, whereas numerical measures such as cosine and Pearson consider the magnitude of ratings. Jaccard is not a dominant measure on its own, but it has long been shown to be an important factor in improving other measures. Therefore, as part of our continuing effort to find the most effective similarity measures for CF, this research proposes new similarity measures that combine Jaccard with several numerical measures. The combined measures take advantage of both rating existence and rating magnitude. Experimental results on the MovieLens dataset showed that the combined measures outperform all single measures across the considered evaluation metrics.
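As a rough illustration of the idea, the following minimal Python sketch combines Jaccard (rating existence) with cosine (rating magnitude) and uses the result to pick nearest neighbors. The abstract does not state how the measures are combined, so the product used here is an assumption, and the function names (`jaccard`, `cosine`, `combined`, `nearest_neighbors`) and the toy ratings are hypothetical.

```python
import math


def jaccard(ratings_u, ratings_v):
    """Fraction of co-rated items among all items rated by either user."""
    items_u, items_v = set(ratings_u), set(ratings_v)
    union = items_u | items_v
    if not union:
        return 0.0
    return len(items_u & items_v) / len(union)


def cosine(ratings_u, ratings_v):
    """Cosine similarity computed over co-rated items only."""
    common = set(ratings_u) & set(ratings_v)
    if not common:
        return 0.0
    dot = sum(ratings_u[i] * ratings_v[i] for i in common)
    norm_u = math.sqrt(sum(ratings_u[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(ratings_v[i] ** 2 for i in common))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def combined(ratings_u, ratings_v):
    """Assumed combination: product of Jaccard and cosine."""
    return jaccard(ratings_u, ratings_v) * cosine(ratings_u, ratings_v)


def nearest_neighbors(target, all_ratings, k):
    """Return the k users most similar to `target` under `combined`."""
    scores = [
        (combined(all_ratings[target], ratings), user)
        for user, ratings in all_ratings.items()
        if user != target
    ]
    scores.sort(key=lambda pair: pair[0], reverse=True)
    return [user for _, user in scores[:k]]


# Toy data, invented for illustration: user -> {item: rating}
ratings = {
    "alice": {"m1": 5, "m2": 3, "m3": 4},
    "bob": {"m1": 5, "m2": 3},
    "carol": {"m4": 2},
}
print(nearest_neighbors("alice", ratings, k=1))
```

In this sketch, two users who agree perfectly on a couple of co-rated items still score below 1 when one of them has rated many items the other has not, which is the kind of existence information that cosine alone would ignore.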