Accurately estimating the post-click conversion rate (CVR) is crucial for ranking systems in industrial applications such as recommendation and advertising. Conventional CVR modeling applies popular deep learning methods and achieves state-of-the-art performance. In practice, however, it encounters several task-specific problems that make CVR modeling challenging. For example, conventional CVR models are trained on samples of clicked impressions but are used for inference over the entire space of all impressions. This causes a sample selection bias problem. In addition, conversions are extremely sparse, which makes the model difficult to fit. In this paper, we model CVR from a brand-new perspective by making good use of the sequential pattern of user actions, i.e., impression -> click -> conversion. The proposed Entire Space Multi-task Model (ESMM) can eliminate both problems simultaneously by i) modeling CVR directly over the entire space and ii) employing a feature representation transfer learning strategy. Experiments on a dataset gathered from Taobao's recommender system demonstrate that ESMM significantly outperforms competitive methods. We also release a sampled version of this dataset to enable future research. To the best of our knowledge, this is the first public dataset that contains samples with sequential dependence of click and conversion labels for CVR modeling.
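To make the idea of modeling CVR over the entire space concrete, the following is a minimal sketch rather than the implementation described in the paper. It assumes that the sequential pattern impression -> click -> conversion is captured by treating the click-and-convert probability (written pCTCVR here) as the product of a click probability and a post-click conversion probability, and that both supervised signals use cross-entropy computed over all impressions. The function names (`sigmoid`, `bce`, `entire_space_loss`) and the use of raw logits as inputs are illustrative assumptions, and the shared feature representation is omitted.

```python
import math


def sigmoid(x):
    """Map a real-valued logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def bce(p, y, eps=1e-12):
    """Binary cross-entropy for one prediction p and one label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def entire_space_loss(ctr_logits, cvr_logits, clicks, conversions):
    """Average CTR loss plus CTCVR loss over ALL impressions.

    Hypothetical helper: the CVR output is never supervised directly on
    clicked samples only. It is trained through the product
    pCTCVR = pCTR * pCVR, whose label (click AND conversion) is defined
    for every impression.
    """
    total = 0.0
    for z_ctr, z_cvr, y, z in zip(ctr_logits, cvr_logits, clicks, conversions):
        p_ctr = sigmoid(z_ctr)
        p_cvr = sigmoid(z_cvr)
        p_ctcvr = p_ctr * p_cvr          # assumed factorization
        total += bce(p_ctr, y)           # click label, all impressions
        total += bce(p_ctcvr, y * z)     # click-and-convert label, all impressions
    return total / len(clicks)


# Two impressions: one clicked and converted, one not clicked.
loss = entire_space_loss([0.0, 0.0], [0.0, 0.0], [1, 0], [1, 0])
print(loss)
```

Because both loss terms are defined for every impression, the unclicked sample still contributes a gradient to the CVR output through the product term, which is the sense in which this sketch trains over the entire space.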