Recommender models are typically trained on historical interactions, which are highly sparse: most user-item pairs are unobserved, i.e., missing data. A standard practice is to treat the missing data as negative training samples and to estimate the interaction likelihood of user-item pairs from these samples together with the observed interactions. As a result, some potential interactions are inevitably mislabeled during training, which hurts model fidelity and hinders the model from recalling the mislabeled items, especially long-tail ones. In this work, we investigate the mislabeling issue from the new perspective of aleatoric uncertainty, which describes the inherent randomness of missing data. This randomness pushes us to go beyond the interaction likelihood alone and embrace aleatoric uncertainty modeling. To this end, we propose a new Aleatoric Uncertainty-aware Recommendation (AUR) framework that consists of a new uncertainty estimator and a normal recommender model. Following the theory of aleatoric uncertainty, we derive a new recommendation objective to learn the estimator. Because the chance of mislabeling reflects the potential of a pair, AUR makes recommendations according to the uncertainty, which is shown to improve recommendation performance on less popular items without sacrificing overall performance. We instantiate AUR on three representative recommender models from mainstream model architectures: Matrix Factorization (MF), LightGCN, and VAE. Extensive results on two real-world datasets validate the effectiveness of AUR, which yields better recommendation results, especially on long-tail items.
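The abstract does not state the AUR objective, so the following is only a minimal sketch of the general idea of aleatoric uncertainty modeling, not the paper's method. It assumes the widely used heteroscedastic Gaussian negative log-likelihood, in which a per-pair log-variance is learned alongside the predicted score. The function names `gaussian_nll` and `optimal_log_var` and the example scores are hypothetical and chosen purely for illustration.

```python
import math


def gaussian_nll(label, score, log_var):
    """Heteroscedastic Gaussian negative log-likelihood for one user-item pair.

    Assumption: this is the generic aleatoric-uncertainty loss, not the
    objective derived in the AUR paper (the abstract does not give it).
    The constant 0.5 * log(2 * pi) is dropped.
    """
    residual = label - score
    return 0.5 * math.exp(-log_var) * residual ** 2 + 0.5 * log_var


def optimal_log_var(label, score, eps=1e-12):
    """Log-variance that minimizes gaussian_nll for a fixed label and score.

    Setting the derivative with respect to log_var to zero gives
    exp(log_var) = (label - score) ** 2; eps guards against log(0).
    """
    return math.log((label - score) ** 2 + eps)


# A missing pair is labeled 0 (negative), yet the recommender scores it highly,
# which hints that the pair may be a mislabeled potential interaction.
suspicious = optimal_log_var(label=0.0, score=0.9)

# A missing pair that the recommender also scores low agrees with its label.
confident = optimal_log_var(label=0.0, score=0.1)

# Under this loss, the pair whose label disagrees with the score is assigned
# the larger uncertainty.
print("suspicious pair log-variance:", suspicious)
print("confident pair log-variance:", confident)
```

In this sketch, a pair whose negative label conflicts with the model's score ends up with higher estimated variance, which loosely mirrors the abstract's point that the chance of mislabeling reflects the potential of a pair. How AUR actually turns uncertainty into a ranking is not specified here.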