We present an end-to-end methodological framework for causal segment discovery, which aims to uncover how the impact of a treatment differs across subgroups of users in large-scale digital experiments. Building on recent developments in causal inference and non- and semiparametric statistics, our approach unifies two objectives: (1) discovering user segments that stand to benefit from a candidate treatment, based on subgroup-specific treatment effects, and (2) evaluating the causal impact of dynamically assigning units to a study's treatment arm according to their predicted segment-specific benefit or harm. Our proposal is model-agnostic, can incorporate state-of-the-art machine learning algorithms into the estimation procedure, and applies to both randomized A/B tests and quasi-experiments. We also introduce sherlock, an open-source R package that implements the framework.
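To make the two objectives concrete, the following is a minimal sketch on a toy randomized A/B test, not the estimation procedure of the framework or the sherlock API. It assumes 50/50 randomization within each segment, uses plain differences in means in place of machine learning estimators, and all function names, segment labels, and data values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def segment_effects(rows):
    """Objective (1): treated-minus-control difference in mean outcome per segment."""
    by_arm = defaultdict(lambda: {0: [], 1: []})
    for segment, treated, outcome in rows:
        by_arm[segment][treated].append(outcome)
    return {s: mean(arms[1]) - mean(arms[0]) for s, arms in by_arm.items()}


def assignment_rule(effects, threshold=0.0):
    """Recommend treatment (1) only for segments whose estimated effect exceeds the threshold."""
    return {s: int(effect > threshold) for s, effect in effects.items()}


def rule_value(rows, rule):
    """Objective (2): plug-in estimate of the mean outcome if every segment
    received its recommended arm, weighting segments by their sample share."""
    matched = defaultdict(list)
    counts = defaultdict(int)
    for segment, treated, outcome in rows:
        counts[segment] += 1
        if treated == rule[segment]:
            matched[segment].append(outcome)
    n = sum(counts.values())
    return sum(counts[s] / n * mean(matched[s]) for s in counts)


# Hypothetical data: (segment, treated indicator, outcome).
rows = [
    ("new", 1, 5.0), ("new", 1, 7.0), ("new", 0, 2.0), ("new", 0, 4.0),
    ("returning", 1, 3.0), ("returning", 1, 3.0),
    ("returning", 0, 4.0), ("returning", 0, 6.0),
]

effects = segment_effects(rows)    # {"new": 3.0, "returning": -2.0}
rule = assignment_rule(effects)    # {"new": 1, "returning": 0}
value = rule_value(rows, rule)     # 5.5
```

In this toy example the segment-based rule yields an estimated mean outcome of 5.5, compared with 4.5 when everyone is treated and 4.0 when no one is. Because the same observations are used both to learn the rule and to evaluate it, the estimate is optimistic; a realistic analysis would separate the two steps, for example by sample splitting.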