In the health and social sciences, it is critically important to identify subgroups of the study population in which the causal effect of a treatment differs notably from the average treatment effect. Data-driven discovery of heterogeneous treatment effects (HTE) via decision tree methods has been proposed for this task. Despite its high interpretability, single-tree discovery of HTE tends to be highly unstable and to find an oversimplified representation of treatment heterogeneity. To address these shortcomings, we propose Causal Rule Ensemble (CRE), a new method to discover heterogeneous subgroups through an ensemble-of-trees approach. CRE has the following features: it 1) provides an interpretable representation of the HTE; 2) allows extensive exploration of complex heterogeneity patterns; and 3) guarantees high stability in the discovery. The discovered subgroups are defined in terms of interpretable decision rules, and we develop a general two-stage approach for estimating subgroup-specific conditional causal effects, with theoretical guarantees. Via simulations, we show that the CRE method has strong discovery ability and competitive estimation performance compared to state-of-the-art techniques. Finally, we apply CRE to discover the subgroups most vulnerable to the effects of exposure to air pollution on mortality among 35.3 million Medicare beneficiaries across the contiguous U.S.