Researchers often run resource-intensive randomized controlled trials (RCTs) to estimate the causal effects of interventions on outcomes of interest. Yet these outcomes are frequently noisy, and estimated overall effects can be small or imprecise. Nevertheless, we may be able to produce reliable evidence of an intervention's efficacy by finding subgroups with significant effects. In this paper, we propose a machine-learning method that is specifically optimized for finding such subgroups in noisy data. Unlike existing methods for personalized treatment assignment, our tool is designed from the outset to take significance testing into account: it produces a subgroup chosen to maximize the probability of obtaining a statistically significant positive treatment effect. We provide a computationally efficient implementation using decision trees and demonstrate its gains over selecting subgroups based on positive (estimated) treatment effects. Compared to standard tree-based regression and classification tools, this approach tends to yield higher power in detecting subgroups affected by the treatment.