Mean-based estimators of the causal effect in a completely randomized experiment (e.g., the difference-in-means estimator) may behave poorly if the potential outcomes have a heavy-tail, or contain outliers. We study an alternative estimator by Rosenbaum that estimates the constant additive treatment effect by inverting a randomization test using ranks. By investigating the breakdown point and asymptotic relative efficiency of this rank-based estimator, we show that it is provably robust against heavy-tailed potential outcomes, and has variance that is asymptotically, in the worst case, at most about 1.16 times that of the difference-in-means estimator; and its variance can be much smaller when the potential outcomes are not light-tailed. We further derive a consistent estimator of the asymptotic standard error for Rosenbaum's estimator which yields a readily computable confidence interval for the treatment effect. Further, we study a regression adjusted version of Rosenbaum's estimator to incorporate additional covariate information in randomization inference. We prove gain in efficiency by this regression adjustment method under a linear regression model. We illustrate through synthetic and real data that, unlike the mean-based estimators, these rank-based estimators (both unadjusted or regression adjusted) are efficient and robust against heavy-tailed distributions, contamination, and model misspecification.
翻译:暂无翻译