Double-blind randomized controlled trials are traditionally seen as the gold standard for causal inference because the difference-in-means estimator is an unbiased estimator of the average treatment effect in the experiment. That this estimator is unbiased over all possible randomizations does not, however, mean that any given estimate is close to the true treatment effect. Similarly, while pre-determined covariates are balanced between treatment and control groups on average, large imbalances may be observed in a given experiment, and the researcher may therefore want to condition on such covariates using linear regression. This paper studies the theoretical properties of both the difference-in-means and OLS estimators conditional on observed differences in covariates. By deriving the statistical properties of the conditional estimators, we can establish guidance for how to deal with covariate imbalances. We study inference both with OLS and with a new version of Fisher's exact test, in which the randomization distribution comes from a small subset of all possible assignment vectors.
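The contrast between the unadjusted and regression-adjusted estimators, and the idea of a randomization test restricted to a subset of assignment vectors, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the paper's construction: the simulated data, the function names, and in particular the rule for choosing the subset, which here keeps only re-randomized assignments whose covariate imbalance lies within a tolerance `tol` of the observed imbalance. Because a tolerance window is not an exact partition of the assignment space, the resulting p-value should be read as approximate.

```python
import numpy as np


def diff_in_means(y, w):
    """Mean of y among treated units (w == 1) minus mean among controls (w == 0)."""
    return y[w == 1].mean() - y[w == 0].mean()


def ols_adjusted(y, w, x):
    """Coefficient on w from an OLS regression of y on an intercept, w, and x."""
    design = np.column_stack([np.ones(len(y)), w, x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]


def restricted_randomization_pvalue(y, w, x, tol, draws, rng):
    """Fisher-style p-value under the sharp null of no effect for any unit.

    Hypothetical restriction rule (an assumption, not the paper's): keep only
    permuted assignments whose covariate imbalance is within `tol` of the
    imbalance observed in the actual experiment.
    """
    observed_stat = abs(diff_in_means(y, w))
    observed_imbalance = diff_in_means(x, w)
    kept = []
    for _ in range(draws):
        w_star = rng.permutation(w)  # same number of treated units as observed
        if abs(diff_in_means(x, w_star) - observed_imbalance) <= tol:
            kept.append(abs(diff_in_means(y, w_star)))
    kept = np.asarray(kept, dtype=float)
    # Count the observed assignment itself so the p-value is never zero.
    p_value = (1 + np.sum(kept >= observed_stat)) / (1 + kept.size)
    return p_value, kept.size


# Simulated experiment (all numbers are illustrative assumptions).
rng = np.random.default_rng(0)
n = 40
x = rng.normal(size=n)                          # pre-determined covariate
w = rng.permutation(np.repeat([1, 0], n // 2))  # completely randomized assignment
y = 2.0 * x + 1.0 * w + rng.normal(scale=0.5, size=n)

print("covariate imbalance:", diff_in_means(x, w))
print("difference in means:", diff_in_means(y, w))
print("OLS-adjusted estimate:", ols_adjusted(y, w, x))
p_value, n_kept = restricted_randomization_pvalue(y, w, x, tol=0.1, draws=2000, rng=rng)
print("restricted randomization p-value:", p_value, "from", n_kept, "assignments")
```

In this sketch the difference in means absorbs whatever covariate imbalance the realized assignment happens to have, while the OLS coefficient adjusts for it. The restricted test compares the observed statistic only against assignments with a similar imbalance, which is one simple way to make the reference distribution conditional on the observed covariate difference.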