Uplift modeling aims to estimate the treatment effect on individuals and is widely applied on e-commerce platforms to target persuadable customers and maximize the return on marketing activities. Among existing uplift modeling methods, tree-based methods are adept at fitting the increment and at generalization, while neural-network-based models excel at predicting absolute values and at precision; these advantages have not been fully explored or combined. In addition, the lack of counterfactual sample pairs is the root challenge in uplift modeling. In this paper, we propose an uplift modeling framework based on Knowledge Distillation and Sample Matching (KDSM). The teacher model is an uplift decision tree (UpliftDT), whose structure is exploited to construct counterfactual sample pairs, and the pairwise incremental prediction is treated as an additional objective for the student model. Under the idea of multitask learning, the student model can generalize better and even surpass the teacher. Extensive offline experiments validate the universality of KDSM across different combinations of teacher and student models and its superiority over the baselines. In online A/B testing, the cost of each incremental room night is reduced by 6.5\%.
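The sample-matching idea can be illustrated with a minimal sketch. It assumes the teacher tree has already assigned every sample to a leaf, pairs each treated sample with a randomly chosen control sample from the same leaf, and adds a squared-error term that pulls the student's pairwise prediction difference toward the leaf-level uplift. The pairing rule, the squared-error losses, the `weight` parameter, and all function names are illustrative assumptions, not the formulation used in KDSM.

```python
import random
from collections import defaultdict


def group_by_leaf(leaf_ids, treatment):
    """Split sample indices into treated/control lists for each teacher leaf."""
    groups = defaultdict(lambda: {"treated": [], "control": []})
    for idx, (leaf, t) in enumerate(zip(leaf_ids, treatment)):
        groups[leaf]["treated" if t == 1 else "control"].append(idx)
    return groups


def leaf_uplift(groups, outcome):
    """Teacher signal: mean treated outcome minus mean control outcome per leaf."""
    uplift = {}
    for leaf, g in groups.items():
        if g["treated"] and g["control"]:
            mean_t = sum(outcome[i] for i in g["treated"]) / len(g["treated"])
            mean_c = sum(outcome[i] for i in g["control"]) / len(g["control"])
            uplift[leaf] = mean_t - mean_c
    return uplift


def match_pairs(groups, rng):
    """Pair each treated sample with a random control sample from the same leaf
    (assumed matching rule; leaves without control samples are skipped)."""
    pairs = []
    for leaf, g in groups.items():
        if not g["control"]:
            continue
        for i in g["treated"]:
            pairs.append((i, rng.choice(g["control"]), leaf))
    return pairs


def multitask_loss(pred, outcome, pairs, uplift, weight=1.0):
    """Outcome loss plus a weighted pairwise incremental loss (both squared error)."""
    outcome_loss = sum((p - y) ** 2 for p, y in zip(pred, outcome)) / len(outcome)
    if not pairs:
        return outcome_loss
    pair_loss = sum(
        (pred[i] - pred[j] - uplift[leaf]) ** 2 for i, j, leaf in pairs
    ) / len(pairs)
    return outcome_loss + weight * pair_loss


# Toy data: leaf assignments would come from the teacher tree (hypothetical values).
leaf_ids = [0, 0, 0, 0, 1, 1]
treatment = [1, 1, 0, 0, 1, 0]
outcome = [1.0, 1.0, 0.0, 1.0, 0.0, 0.0]

groups = group_by_leaf(leaf_ids, treatment)
uplift = leaf_uplift(groups, outcome)
pairs = match_pairs(groups, random.Random(0))
# Using the observed outcomes as stand-in student predictions.
loss = multitask_loss(outcome, outcome, pairs, uplift)
```

In a real student network, `pred` would be the model's output for each sample under its assigned treatment, and the second term would be minimized jointly with the outcome loss, in the spirit of the multitask objective described above.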