Efficiency of Regression (Un)-Adjusted Rosenbaum's Rank-based Estimator in Randomized Experiments

A completely randomized experiment allows us to estimate the causal effect by the difference between the average outcomes under treatment and control. However, difference-in-means type estimators behave poorly if the potential outcomes are heavy-tailed or contain a few extreme observations or outliers. We study an alternative estimator, due to Rosenbaum, that estimates the causal effect by inverting a randomization test based on ranks. We derive the asymptotic properties of this estimator and develop a framework for comparing the efficiencies of different estimators of the treatment effect in randomized experiments. In particular, we show that the asymptotic variance of the Rosenbaum estimator is, in the worst case, at most about 1.16 times that of the difference-in-means estimator, and can be much smaller when the potential outcomes are not light-tailed. We further derive a consistent estimator of the asymptotic standard error of the Rosenbaum estimator, which immediately yields a readily computable confidence interval for the treatment effect and thereby avoids the expensive numerical calculations needed to implement Rosenbaum's original proposal. We also propose a regression-adjusted version of the Rosenbaum estimator that incorporates additional covariate information in randomization inference, and we prove that this regression adjustment yields a gain in efficiency under a linear regression model. Finally, we illustrate through simulations that, unlike the difference-in-means based estimators, whether unadjusted or regression adjusted, these rank-based estimators are efficient and robust against heavy-tailed distributions, contamination, and various model misspecifications.
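To make the contrast between the two estimators concrete, the following is a minimal sketch and not the authors' implementation. It assumes the rank test being inverted is the Wilcoxon rank-sum test under a constant additive treatment effect. In that case, inverting the test gives the classical Hodges-Lehmann two-sample estimate: the median of all treated-minus-control pairwise differences. The function names and the toy data are hypothetical and chosen only for illustration.

```python
import numpy as np


def difference_in_means(y, z):
    """Mean outcome of treated units minus mean outcome of control units."""
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=bool)
    return float(y[z].mean() - y[~z].mean())


def rank_based_estimate(y, z):
    """Point estimate from inverting the Wilcoxon rank-sum test.

    Assumption (not taken from the abstract): with the Wilcoxon rank-sum
    statistic and a constant additive effect, the inversion reduces to the
    median of all pairwise differences (treated outcome - control outcome).
    """
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=bool)
    # Matrix of every treated-minus-control difference.
    pairwise = np.subtract.outer(y[z], y[~z])
    return float(np.median(pairwise))


# Toy data: a constant effect of 2, except one treated unit is an outlier.
y = np.array([2, 3, 4, 5, 106, 0, 1, 2, 3, 4])
z = np.array([1, 1, 1, 1, 1, 0, 0, 0, 0, 0])

print(difference_in_means(y, z))  # 22.0
print(rank_based_estimate(y, z))  # 2.0
```

In this toy sample a single extreme treated outcome pulls the difference in means to 22, while the rank-based estimate stays at 2, which is the kind of robustness to outliers that the abstract describes.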
