Generalizing causal estimates from randomized experiments to a broader target population is essential for guiding decisions by policymakers and practitioners in the social and biomedical sciences. Although recent papers have developed various weighting estimators for the population average treatment effect (PATE), many of these estimators have large variance because the experimental sample often differs substantially from the target population, which makes the estimated sampling weights extreme. To improve efficiency in practice, we propose post-residualized weighting: we use the outcome measured in the observational population data to build a flexible predictive model (e.g., with machine learning methods) and residualize the outcome in the experimental data before applying conventional weighting methods. We show that the proposed PATE estimator is consistent under the same assumptions required for existing weighting methods and, importantly, without assuming that the predictive model is correctly specified. We demonstrate the efficiency gains from this approach through simulations and an application based on a set of job training experiments.
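The residualize-then-weight idea described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names `weighted_diff_in_means` and `post_residualized_estimate` are hypothetical, an ordinary least-squares fit stands in for the "flexible predictive model", the sampling weights are taken as given rather than estimated, and the toy data are made up purely to show the call.

```python
import numpy as np


def weighted_diff_in_means(y, t, w):
    """Weighted difference in mean outcome between treated and control units."""
    y, t, w = (np.asarray(a, dtype=float) for a in (y, t, w))
    treated_mean = np.sum(w * t * y) / np.sum(w * t)
    control_mean = np.sum(w * (1 - t) * y) / np.sum(w * (1 - t))
    return treated_mean - control_mean


def post_residualized_estimate(y_exp, t_exp, x_exp, w_exp, x_pop, y_pop):
    """Residualize experimental outcomes with a model fit on population data,
    then apply a conventional weighted difference in means.

    Assumption: a linear model is used here only as a stand-in for whatever
    flexible predictor one would fit in practice.
    """
    x_pop = np.asarray(x_pop, dtype=float)
    x_exp = np.asarray(x_exp, dtype=float)

    # Step 1: fit the predictive model on the observational population data.
    design_pop = np.column_stack([np.ones(len(x_pop)), x_pop])
    coef, *_ = np.linalg.lstsq(design_pop, np.asarray(y_pop, dtype=float), rcond=None)

    # Step 2: residualize the outcomes in the experimental sample.
    design_exp = np.column_stack([np.ones(len(x_exp)), x_exp])
    residuals = np.asarray(y_exp, dtype=float) - design_exp @ coef

    # Step 3: weight the residuals exactly as one would weight raw outcomes.
    return weighted_diff_in_means(residuals, t_exp, w_exp)


# Toy, made-up data: population outcomes follow y = 1 + 2x, and treatment
# adds a constant 5 to the outcome in the experimental sample.
x_pop = np.array([[0.0], [1.0], [2.0]])
y_pop = np.array([1.0, 3.0, 5.0])
x_exp = np.array([[0.0], [1.0], [0.0], [1.0]])
t_exp = np.array([1, 1, 0, 0])
y_exp = np.array([6.0, 8.0, 1.0, 3.0])
w_exp = np.array([1.0, 2.0, 1.0, 2.0])  # assumed sampling weights

estimate = post_residualized_estimate(y_exp, t_exp, x_exp, w_exp, x_pop, y_pop)
print(estimate)
```

Because the prediction subtracted from each outcome depends only on covariates and not on treatment status, the residualized estimator targets the same quantity as the unresidualized one. The intended benefit is that residuals may vary less than raw outcomes, so extreme weights inflate the variance less.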