In health and social sciences, it is critically important to identify subgroups of the study population in which treatment effects differ notably from the population average, that is, where there is heterogeneity of treatment effects (HTE). Decision trees have been proposed and commonly adopted for data-driven discovery of HTE due to their high level of interpretability. However, single-tree discovery of HTE can be unstable and oversimplified. This paper introduces Causal Rule Ensemble (CRE), a new method for HTE discovery and estimation through an ensemble-of-trees approach. CRE offers several key features, including 1) an interpretable representation of the HTE; 2) the ability to explore complex heterogeneity patterns; and 3) high stability in subgroup discovery. The discovered subgroups are defined in terms of interpretable decision rules. Estimation of subgroup-specific causal effects is performed via a two-stage approach for which we provide theoretical guarantees. Via simulations, we show that the CRE method is highly competitive when compared to state-of-the-art techniques. Finally, we apply CRE to discover the heterogeneous health effects of exposure to air pollution on mortality for 35.3 million Medicare beneficiaries across the contiguous U.S.