消除以街区为基础的协作过滤中受评级约束的问题:功能恢复方法 (Conquering the rating bound problem in neighborhood-based collaborative filtering: a function recovery approach)

As an important tool for information filtering in the era of socialized web, recommender systems have witnessed rapid development in the last decade. As benefited from the better interpretability, neighborhood-based collaborative filtering techniques, such as item-based collaborative filtering adopted by Amazon, have gained a great success in many practical recommender systems. However, the neighborhood-based collaborative filtering method suffers from the rating bound problem, i.e., the rating on a target item that this method estimates is bounded by the observed ratings of its all neighboring items. Therefore, it cannot accurately estimate the unobserved rating on a target item, if its ground truth rating is actually higher (lower) than the highest (lowest) rating over all items in its neighborhood. In this paper, we address this problem by formalizing rating estimation as a task of recovering a scalar rating function. With a linearity assumption, we infer all the ratings by optimizing the low-order norm, e.g., the $l_1/2$-norm, of the second derivative of the target scalar function, while remaining its observed ratings unchanged. Experimental results on three real datasets, namely Douban, Goodreads and MovieLens, demonstrate that the proposed approach can well overcome the rating bound problem. Particularly, it can significantly improve the accuracy of rating estimation by 37% than the conventional neighborhood-based methods.

翻译：作为社交网络时代信息过滤的一个重要工具,建议者系统在过去十年中目睹了快速发展。由于从更好的解释性中获益,以街区为基础的协作过滤技术,如亚马逊采用的项目基合作过滤技术,在许多切实可行的建议系统中取得了巨大成功。然而,以街区为基础的合作过滤方法受评级约束问题的影响,即,该方法估计目标项目受所有相邻项目的观察评级的约束,因此,建议者系统无法准确估计目标项目未观察到的评级,如果其地面真实评级实际上高于(低于)其周边所有项目的最高(最低)评级。在本文件中,我们通过将评级估算正规化为恢复卡路里评级功能的任务来解决这一问题。根据线性假设,我们通过优化低排序规范(例如,以美元为基值的1/2-norm)来推算所有评级。因此,它无法准确估计目标标标的二次衍生值,如果其地面真实评级实际上高于其所有项目的最高(最低)评级。在本文中,我们通过将评级正规评级评估方法(即Doubal-rass)大幅改进了三个真实评级方法。