Uplift modeling is a rapidly growing approach that uses causal inference and machine learning to directly estimate heterogeneous treatment effects, and in recent years it has been widely applied in online marketplaces to assist large-scale decision-making. Popular existing models, such as causal forest (CF), are either limited to discrete treatments or impose parametric assumptions on the outcome-treatment relationship that may suffer from model misspecification. However, continuous treatments (e.g., price, duration) often arise in marketplaces. To relax these restrictions, we use a kernel-based doubly robust estimator to recover non-parametric dose-response functions that can flexibly model continuous treatment effects. Moreover, we propose a generic distance-based splitting criterion to capture heterogeneity for continuous treatments. We call the proposed algorithm generalized causal forest (GCF), as it generalizes the use case of CF to a much broader setting. We show the effectiveness of GCF by deriving the asymptotic property of the estimator and comparing it with popular uplift modeling methods on both synthetic and real-world datasets. We implement GCF on Spark and have deployed it in a large-scale online pricing system at a leading ride-sharing company. Online A/B testing results further validate the superiority of GCF.
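To make the idea of a kernel-based doubly robust estimator of a dose-response function concrete, the following is a minimal sketch of one standard form of such an estimator, in which an outcome-model prediction is corrected by a kernel-weighted, density-reweighted residual. It is not taken from the GCF implementation: the function names, the quadratic outcome model, the Gaussian treatment-density model, the simulated data, and the bandwidth are all assumptions chosen for illustration, and the forest and splitting criterion are not shown.

```python
import numpy as np


def gaussian_kernel(u):
    """Standard normal density, used both as smoothing kernel and noise model."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)


def fit_outcome_model(t, x, y):
    """Illustrative outcome model: least squares of y on [1, t, t^2, x]."""
    design = np.column_stack([np.ones_like(t), t, t**2, x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)

    def predict(t_new, x_new):
        t_new = np.broadcast_to(t_new, x_new.shape)
        features = np.column_stack([np.ones_like(x_new), t_new, t_new**2, x_new])
        return features @ coef

    return predict


def fit_treatment_density(t, x):
    """Illustrative conditional density f(t | x): linear mean, Gaussian residuals."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, t, rcond=None)
    sigma = (t - design @ coef).std()

    def density(t_new, x_new):
        mean = coef[0] + coef[1] * x_new
        return gaussian_kernel((t_new - mean) / sigma) / sigma

    return density


def kernel_dr_dose_response(t_grid, t, x, y, bandwidth):
    """Kernel-based doubly robust estimate of E[Y(t0)] for each t0 in t_grid.

    For each t0, averages over units i:
        mu(t0, x_i) + K_h(t_i - t0) / f(t0 | x_i) * (y_i - mu(t0, x_i))
    """
    mu = fit_outcome_model(t, x, y)
    density = fit_treatment_density(t, x)
    estimates = []
    for t0 in t_grid:
        mu_t0 = mu(t0, x)
        kernel_weight = gaussian_kernel((t - t0) / bandwidth) / bandwidth
        correction = kernel_weight / density(t0, x) * (y - mu_t0)
        estimates.append(np.mean(mu_t0 + correction))
    return np.array(estimates)


# Simulated data (an assumption for this sketch): the true dose-response is t^2.
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
t = 0.5 * x + rng.normal(size=n)
y = t**2 + x + rng.normal(scale=0.5, size=n)

t_grid = np.array([-1.0, 0.0, 1.0])
estimates = kernel_dr_dose_response(t_grid, t, x, y, bandwidth=0.3)
print(np.round(estimates, 2))
```

In this sketch the estimate stays consistent if either the outcome model or the treatment density is well specified, which is the sense of "doubly robust"; a practical implementation would typically fit both nuisance models with flexible learners and cross-fitting rather than the simple parametric fits assumed here.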