Click-through rate (CTR) prediction, which aims to estimate the probability that a user will click on an item, plays a crucial role in online advertising and recommender systems. Methods based on feature interaction modeling and on user interest mining are the two most popular families of techniques; both have been explored extensively for many years and have made great progress on CTR prediction. However, (1) feature interaction based methods, which rely heavily on the co-occurrence of different features, may suffer from the feature sparsity problem (i.e., many features appear only a few times); and (2) user interest mining based methods, which need rich user behaviors to capture a user's diverse interests, easily run into the behavior sparsity problem (i.e., many users have very short behavior sequences). To alleviate these two problems, we propose a novel module named Dual Graph enhanced Embedding, which is compatible with various CTR prediction models. We further propose a Dual Graph enhanced Embedding Neural Network (DG-ENN) for CTR prediction. Dual Graph enhanced Embedding exploits the strengths of graph representation with two carefully designed learning strategies (divide-and-conquer and curriculum-learning-inspired organized learning) to refine the embedding. We conduct comprehensive experiments on three real-world industrial datasets. The experimental results show that the proposed DG-ENN significantly outperforms state-of-the-art CTR prediction models. Moreover, when applied to state-of-the-art CTR prediction models, Dual Graph enhanced Embedding always obtains better performance. Further case studies show that the proposed Dual Graph enhanced Embedding can alleviate the feature sparsity and behavior sparsity problems. Our framework will be open-sourced, based on MindSpore, in the near future.