Variance Reduction for Causal Inference

Propensity score methods have been shown to be powerful for obtaining efficient estimators of the average treatment effect (ATE) from observational data, especially in the presence of confounding factors. In estimation, deciding which covariates to include in the propensity score function is important, since incorporating unnecessary covariates may amplify both the bias and the variance of ATE estimators. In this paper, we show that including additional instrumental variables that satisfy the exclusion restriction for the outcome harms statistical efficiency. We also prove that controlling for outcome predictors, i.e., covariates that predict the outcome but are unrelated to the exposure, can help reduce the asymptotic variance of ATE estimation. We further note that efficiently estimating the ATE by nonparametric or semiparametric methods requires an estimated propensity score function, as described in Hirano et al. (2003)\cite{Hirano2003}. Such an estimation procedure usually requires many regularity conditions. Rothe (2016)\cite{Rothe2016} also illustrated this point and proposed a known propensity score (KPS) estimator that requires only mild regularity conditions and is still fully efficient. In addition, we introduce a linearly modified (LM) estimator that is nearly efficient in most general settings and does not require estimating the propensity score function, which makes it convenient to compute. The construction of this estimator borrows an idea from the interaction estimator of Lin (2013)\cite{Lin2013}, in which regression adjustment with interaction terms is applied to data arising from a completely randomized experiment. As its name suggests, the LM estimator can be viewed as a linear modification of the IPW estimator that uses known propensity scores. We investigate its statistical properties both analytically and numerically.
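For orientation, the baseline that the LM estimator modifies, the IPW estimator with known propensity scores, can be sketched as follows. This is only a minimal illustration of the standard Horvitz-Thompson form, not the LM or KPS estimator itself. The function name `ipw_ate`, the simulated data-generating process, and the true effect of 2.0 are all assumptions made for the example.

```python
import numpy as np


def ipw_ate(y, t, e):
    """Horvitz-Thompson style IPW estimate of the ATE.

    y: outcomes, t: binary treatment indicators (0/1),
    e: known propensity scores P(T = 1 | X), strictly inside (0, 1).
    """
    y, t, e = (np.asarray(a, dtype=float) for a in (y, t, e))
    if np.any((e <= 0) | (e >= 1)):
        raise ValueError("propensity scores must lie strictly in (0, 1)")
    # Weight treated outcomes by 1/e and control outcomes by 1/(1 - e).
    return float(np.mean(t * y / e - (1 - t) * y / (1 - e)))


# Hypothetical simulated data: one confounder x, logistic propensity,
# and a constant treatment effect of 2.0 (all assumed for illustration).
rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)
e = 1.0 / (1.0 + np.exp(-x))          # known propensity score
t = rng.binomial(1, e)                # treatment assignment
y = 2.0 * t + x + rng.normal(size=n)  # observed outcome

estimate = ipw_ate(y, t, e)
```

Because the propensity scores are supplied rather than fitted, no propensity model has to be estimated. On simulated data of this kind the estimate should fall near the assumed effect of 2.0, although the plain IPW estimator can be noisy when some scores are close to 0 or 1.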
