A completely randomized experiment allows us to estimate the causal effect by the difference in the average outcomes under treatment and control. However, difference-in-means estimators behave poorly if the potential outcomes are heavy-tailed or contain a few outliers. We study an alternative estimator due to Rosenbaum that estimates the causal effect by inverting a randomization test based on ranks. By calculating the asymptotic breakdown point of this estimator, we show that it is provably more robust than the difference-in-means estimator. We obtain the limiting distribution of this estimator and develop a framework to compare the efficiencies of different estimators of the treatment effect in the setting of randomized experiments. In particular, we show that the asymptotic variance of Rosenbaum's estimator is, in the worst case, about 1.16 times the variance of the difference-in-means estimator, and can be much smaller when the potential outcomes are not light-tailed. Further, we propose a regression-adjusted version of Rosenbaum's estimator to incorporate additional covariate information in randomization inference. We prove that this regression adjustment yields a gain in efficiency under a linear regression model. Finally, we illustrate through synthetic and real data that these rank-based estimators, regression-adjusted or unadjusted, are efficient and robust against heavy-tailed distributions, contamination, and model misspecification.
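To make the contrast between the two estimators concrete, the following is a minimal sketch and not the paper's implementation. It assumes the rank statistic is the Wilcoxon rank-sum statistic and that the treatment effect is a constant additive shift; neither assumption is stated in this abstract. Under those assumptions, inverting the rank test gives the Hodges-Lehmann estimate, which is the median of all treated-minus-control pairwise differences. The function names `difference_in_means` and `rank_based_estimate` are hypothetical.

```python
import numpy as np


def difference_in_means(y, z):
    """Mean outcome of treated units minus mean outcome of control units."""
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=bool)
    return float(y[z].mean() - y[~z].mean())


def rank_based_estimate(y, z):
    """Hodges-Lehmann style estimate of a constant additive effect.

    Assumption (not from the abstract): the inverted test uses the
    Wilcoxon rank-sum statistic, so the estimate is the median of all
    pairwise differences (treated outcome minus control outcome).
    """
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=bool)
    # Broadcasting builds the (n_treated x n_control) matrix of differences.
    pairwise = y[z][:, None] - y[~z][None, :]
    return float(np.median(pairwise))


# Toy data: controls 0..4, treated units shifted by 2, and one treated
# outcome replaced by a gross outlier (1000).
y = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 2.0, 3.0, 4.0, 5.0, 1000.0])
z = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])

dim = difference_in_means(y, z)   # approximately 200.8, dominated by the outlier
hl = rank_based_estimate(y, z)    # 2.0, the shift shared by the clean units
```

In this toy data set a single contaminated outcome moves the difference in means far from the shift of 2, while the median of pairwise differences is unaffected, which is the kind of robustness the breakdown-point result formalizes.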