When estimating causal effects, it is important to assess external validity, i.e., to determine how useful a given study is for informing a practical question about a specific target population. One challenge is that the covariate distribution in the population underlying a study may differ from that in the target population. If some covariates are effect modifiers, the average treatment effect (ATE) may not generalize to the target population. To tackle this problem, we propose new methods to generalize or transport the ATE from a source population to a target population in the case where the source and target populations have different sets of covariates. When the ATE in the target population is identified, we propose new doubly robust estimators and establish their rates of convergence and limiting distributions. Under regularity conditions, the doubly robust estimators provably achieve the efficiency bound and are locally asymptotically minimax optimal. We provide a sensitivity analysis for settings in which the identification assumptions fail. Simulation studies show the advantages of the proposed doubly robust estimators over simple plug-in estimators. Importantly, we also provide minimax lower bounds and higher-order estimators of the target functionals. The proposed methods are applied to transport causal effects of dietary intake on adverse pregnancy outcomes from an observational study to the whole U.S. female population.
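To make the contrast between an unadjusted source estimate and a doubly robust transported estimate concrete, the following is a minimal sketch and not the estimator proposed in this work. It assumes the simpler textbook setting in which the source and target populations share a single covariate `x`, treatment is randomized in the source with known probability `e`, and both nuisance models (a linear outcome regression and a logistic selection model) are correctly specified. The data-generating process and the names `simulate`, `naive_source_ate`, and `transported_ate_dr` are hypothetical and chosen only for illustration.

```python
import numpy as np


def simulate(n, seed=0):
    """Hypothetical data: x modifies the effect and drives selection into the source."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    # Source membership S=1 is more likely for small x: P(S=1 | x) = 1 / (1 + exp(x)).
    s = rng.binomial(1, 1.0 / (1.0 + np.exp(x))).astype(float)
    # Treatment is randomized (probability 0.5) in the source; set to 0 in the target.
    a = rng.binomial(1, 0.5, size=n) * s
    # Conditional effect is 1 + 2x; outcomes are unobserved (set to 0) in the target.
    y = (1.0 + x + a * (1.0 + 2.0 * x) + rng.normal(size=n)) * s
    truth = 1.0 + 2.0 * x[s == 0].mean()  # sample ATE in the target population
    return x, s, a, y, truth


def fit_logistic(design, s, n_iter=25):
    """Newton-Raphson logistic regression; `design` includes an intercept column."""
    beta = np.zeros(design.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-design @ beta))
        grad = design.T @ (s - p)
        hess = design.T @ (design * (p * (1.0 - p))[:, None])
        beta += np.linalg.solve(hess, grad)
    return beta


def naive_source_ate(s, a, y):
    """Difference in means within the source sample, ignoring covariate shift."""
    treated = (s == 1) & (a == 1)
    control = (s == 1) & (a == 0)
    return y[treated].mean() - y[control].mean()


def transported_ate_dr(x, s, a, y, e=0.5):
    """Doubly robust estimate of the ATE in the target (S=0) population."""
    design = np.column_stack([np.ones(len(x)), x])
    # Selection model pi(x) = P(S=1 | x), fitted on the pooled sample.
    pi = 1.0 / (1.0 + np.exp(-design @ fit_logistic(design, s)))
    # Arm-specific linear outcome regressions, fitted on the source only.
    mu = {}
    for arm in (0, 1):
        mask = (s == 1) & (a == arm)
        coef, *_ = np.linalg.lstsq(design[mask], y[mask], rcond=None)
        mu[arm] = design @ coef
    plug_in = mu[1] - mu[0]  # regression-based effect for every unit
    # Inverse-probability-weighted residuals from the source trial.
    resid = a / e * (y - mu[1]) - (1.0 - a) / (1.0 - e) * (y - mu[0])
    odds = (1.0 - pi) / pi  # reweights source units toward the target
    n_target = np.sum(s == 0)
    return np.sum((1.0 - s) * plug_in + s * odds * resid) / n_target


if __name__ == "__main__":
    x, s, a, y, truth = simulate(20000, seed=0)
    print("target ATE (sample truth):", truth)
    print("naive source ATE:         ", naive_source_ate(s, a, y))
    print("doubly robust transported:", transported_ate_dr(x, s, a, y))
```

In this simulated design the covariate is both an effect modifier and a driver of selection, so the unadjusted source estimate is expected to miss the target ATE, while the doubly robust estimate combines the outcome regression with odds-weighted residuals to correct for the shift. Handling source and target populations with different covariate sets, as studied here, requires additional identification assumptions that this sketch does not attempt to reproduce.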