Recommenders provide personalized content recommendations to users. They often suffer from highly skewed long-tail item distributions, in which a small fraction of the items receives most of the user feedback. This hurts model quality, especially for item slices with little supervision. Existing work in both academia and industry mainly focuses on re-balancing strategies (e.g., up-sampling and up-weighting), leveraging content features, and transfer learning. However, a deeper understanding of how the long-tail distribution influences recommendation performance is still lacking. In this work, we theoretically demonstrate that the prediction of user preference is biased under long-tail distributions. This bias comes from the discrepancy in both the prior and the conditional probabilities between training data and test data. Most existing methods attempt to reduce the bias only from the prior perspective and ignore the discrepancy in the conditional probability. This leads to a severe forgetting issue and results in suboptimal performance. To address the problem, we design a novel Cross Decoupling Network (CDN) that reduces the differences in both the prior and the conditional probabilities. Specifically, CDN (i) decouples the learning of memorization and generalization on the item side through a mixture-of-experts structure, and (ii) decouples the user samples from different distributions through a regularized bilateral branch network. Finally, a novel adapter aggregates the decoupled vectors and softly shifts the training attention to tail items. Extensive experimental results show that CDN significantly outperforms state-of-the-art approaches on popular benchmark datasets, improving HR@50 (hit ratio at 50) by 8.7% for overall recommendation and by 12.4% for tail items.
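The abstract gives no formulas, so the following is only a minimal sketch of how the three pieces it names (an item-side gate over two experts, two user-side branches, and an adapter that softly shifts attention to tail items) could fit together. Every function name, the quadratic decay schedule, the `gamma` and `pivot` parameters, and the frequency-based gate are assumptions made for illustration and are not taken from the paper.

```python
from typing import List


def adapter_weight(step: int, total_steps: int, gamma: float = 1.0) -> float:
    """Hypothetical adapter schedule (an assumption, not the paper's formula).

    Returns 1.0 at the start of training (all weight on the main branch)
    and decays quadratically to 0.0, so later steps weight the tail branch more.
    """
    if total_steps <= 0 or gamma <= 0:
        raise ValueError("total_steps and gamma must be positive")
    ratio = min(step / (gamma * total_steps), 1.0)
    return 1.0 - ratio ** 2


def blend(main_vec: List[float], tail_vec: List[float], alpha: float) -> List[float]:
    """Aggregate two decoupled vectors as a convex combination."""
    if len(main_vec) != len(tail_vec):
        raise ValueError("vectors must have the same length")
    return [alpha * m + (1.0 - alpha) * t for m, t in zip(main_vec, tail_vec)]


def gate_item_experts(
    memorization_vec: List[float],
    generalization_vec: List[float],
    item_count: float,
    pivot: float = 100.0,
) -> List[float]:
    """Hypothetical frequency-based gate over two item-side experts.

    Popular items (large item_count) lean on the memorization expert;
    rare items lean on the generalization expert. `pivot` is an assumed
    hyperparameter: the interaction count at which both experts weigh equally.
    """
    gate = item_count / (item_count + pivot)
    return blend(memorization_vec, generalization_vec, gate)


if __name__ == "__main__":
    # Halfway through training, the assumed schedule gives the main branch 0.75.
    alpha = adapter_weight(step=50, total_steps=100)
    print(alpha, blend([1.0, 0.0], [0.0, 1.0], alpha))
    # An item with exactly `pivot` interactions mixes both experts equally.
    print(gate_item_experts([2.0], [4.0], item_count=100.0))
```

In this sketch the shift toward tail items is gradual rather than a hard switch, which is one plausible reading of "softly shift the training attention"; a learned gate or a different schedule would fit the abstract's description equally well.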