Incorporating graph side information into recommender systems has been widely used to better predict ratings, but relatively few works have focused on theoretical guarantees. Ahn et al. (2018) firstly characterized the optimal sample complexity in the presence of graph side information, but the results are limited due to strict, unrealistic assumptions made on the unknown latent preference matrix and the structure of user clusters. In this work, we propose a new model in which 1) the unknown latent preference matrix can have any discrete values, and 2) users can be clustered into multiple clusters, thereby relaxing the assumptions made in prior work. Under this new model, we fully characterize the optimal sample complexity and develop a computationally-efficient algorithm that matches the optimal sample complexity. Our algorithm is robust to model errors and outperforms the existing algorithms in terms of prediction performance on both synthetic and real data.
翻译:将图形侧信息纳入推荐人系统已被广泛用于更好地预测评级,但相对较少的作品侧重于理论保障。 Ahn 等人(2018年)首先对图形侧信息中的最佳样本复杂性进行了定性,但由于对未知的潜在偏好矩阵和用户群结构所作的严格、不现实的假设,结果有限。在这项工作中,我们提出了一个新模型,其中1)未知的潜在偏好矩阵可以具有任何离散值,2)用户可以分成多个组群,从而放松先前工作中的假设。在这一新模型下,我们充分描述最佳样本复杂性,并开发一种与最佳样本复杂性相匹配的计算效率算法。我们的算法非常强大,可以模拟错误,在合成数据和真实数据的预测性能方面优于现有的算法。