Estimation of the average treatment effect (ATE) is a central problem in causal inference. Inference for the ATE in the presence of high-dimensional covariates has recently been studied extensively. Among the diverse approaches that have been proposed, augmented inverse probability weighting (AIPW) with cross-fitting has emerged as a popular choice in practice. In this work, we study this cross-fit AIPW estimator under well-specified outcome regression and propensity score models in a high-dimensional regime where the numbers of features and samples are both large and comparable. Under assumptions on the covariate distribution, we establish a new CLT for the suitably scaled cross-fit AIPW estimator that applies without any sparsity assumptions on the underlying high-dimensional parameters. Our CLT uncovers two crucial phenomena, among others: (i) the AIPW estimator exhibits a substantial variance inflation that can be precisely quantified in terms of the signal-to-noise ratio and other problem parameters, and (ii) the asymptotic covariance between the pre-cross-fit estimates is non-negligible even on the root-n scale. In fact, these cross-covariances turn out to be negative in our setting. These findings are strikingly different from their classical counterparts. On the technical front, our work utilizes a novel interplay between three distinct tools: approximate message passing theory, the theory of deterministic equivalents, and the leave-one-out approach. We believe our proof techniques should be useful for analyzing other two-stage estimators in this high-dimensional regime. Finally, we complement our theoretical results with simulations that demonstrate both the finite-sample efficacy of our CLT and its robustness to our assumptions.
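For readers less familiar with the estimator, the following is a minimal sketch of two-fold cross-fit AIPW, assuming a linear outcome regression fitted by least squares within each treatment arm and a logistic propensity model fitted by Newton's method. All function names, the propensity clipping level, and the simulated data are illustrative assumptions and are not taken from the paper. The example is deliberately low-dimensional, so it does not reproduce the proportional regime or the variance inflation described above. The two returned values `est1` and `est2` correspond to the pre-cross-fit estimates, and their average is the cross-fit estimate.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(X, a, n_iter=25, ridge=1e-8):
    """Logistic regression by Newton's method (tiny ridge for stability)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ beta)
        w = p * (1.0 - p)
        hess = (X.T * w) @ X + ridge * np.eye(X.shape[1])
        beta = beta + np.linalg.solve(hess, X.T @ (a - p))
    return beta


def fit_ols(X, y):
    """Least-squares coefficients for a linear outcome model."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def aipw_one_split(X_tr, a_tr, y_tr, X_ev, a_ev, y_ev, clip=1e-3):
    """Fit nuisances on the training fold, evaluate AIPW on the other fold."""
    alpha1 = fit_ols(X_tr[a_tr == 1], y_tr[a_tr == 1])
    alpha0 = fit_ols(X_tr[a_tr == 0], y_tr[a_tr == 0])
    beta = fit_logistic(X_tr, a_tr)
    mu1, mu0 = X_ev @ alpha1, X_ev @ alpha0
    # Clipping the propensities is an illustrative safeguard (assumption).
    e = np.clip(sigmoid(X_ev @ beta), clip, 1.0 - clip)
    psi = (mu1 - mu0
           + a_ev * (y_ev - mu1) / e
           - (1.0 - a_ev) * (y_ev - mu0) / (1.0 - e))
    return psi.mean()


def cross_fit_aipw(X, a, y, seed=0):
    """Two-fold cross-fit AIPW: swap the roles of the folds and average."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    f1, f2 = idx[: len(y) // 2], idx[len(y) // 2:]
    est1 = aipw_one_split(X[f2], a[f2], y[f2], X[f1], a[f1], y[f1])
    est2 = aipw_one_split(X[f1], a[f1], y[f1], X[f2], a[f2], y[f2])
    return 0.5 * (est1 + est2), est1, est2


def simulate(n=4000, p=5, tau=2.0, seed=0):
    """Hypothetical well-specified data: Gaussian covariates, true ATE = tau."""
    rng = np.random.default_rng(seed)
    X = np.column_stack([np.ones(n), rng.standard_normal((n, p))])
    beta = np.concatenate([[0.0], np.full(p, 0.5 / np.sqrt(p))])
    a = rng.binomial(1, sigmoid(X @ beta)).astype(float)
    alpha0 = np.concatenate([[0.0], rng.standard_normal(p) / np.sqrt(p)])
    alpha1 = alpha0.copy()
    alpha1[0] = tau  # arms differ only in the intercept
    y = np.where(a == 1, X @ alpha1, X @ alpha0) + rng.standard_normal(n)
    return X, a, y


if __name__ == "__main__":
    X, a, y = simulate()
    ate, est1, est2 = cross_fit_aipw(X, a, y)
    print(f"cross-fit AIPW: {ate:.3f} (split estimates: {est1:.3f}, {est2:.3f})")
```

Repeating `simulate` and `cross_fit_aipw` over many seeds and recording `est1` and `est2` would give an empirical view of the covariance between the two split estimates, which is the quantity the CLT above characterizes in the high-dimensional regime.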