Learning user preferences for products from past purchases or reviews is a cornerstone of modern recommendation engines. One complication in this learning task is that some users are more likely than others to purchase or review products, and some products are more likely than others to be purchased or reviewed. This non-uniform pattern degrades the power of many existing recommendation algorithms, which assume that the observed data are sampled uniformly at random among user-product pairs. In addition, the existing literature on modeling non-uniformity either assumes that user interests are independent of the products or lacks theoretical understanding. In this paper, we first model user-product preferences as a partially observed matrix with a non-uniform observation pattern. Next, building on the literature on low-rank matrix estimation, we introduce a new weighted trace-norm penalized regression to predict the unobserved values of the matrix. We then prove an upper bound on the prediction error of the proposed approach. The bound is a function of several parameters determined by a weight matrix that depends on the joint distribution of users and products. Using this observation, we introduce a new optimization problem that selects the weight matrix minimizing the upper bound on the prediction error. The final product is a new estimator, NU-Recommend, that outperforms existing methods on both synthetic and real datasets. Our approach aims at accurate predictions for all users while prioritizing fairness. To achieve this, we employ a bias-variance tradeoff mechanism that ensures good overall prediction performance without compromising predictive accuracy for less active users.