Understanding the impact of the most effective policies or treatments on a response variable of interest is a goal of many empirical studies in economics, statistics, and other disciplines. Because of the widespread winner's curse phenomenon, conventional statistical inference, which assumes that the top policies are chosen independently of the random sample, may lead to overly optimistic evaluations of the best policies. In recent years, with the increased availability of large datasets, this issue can be further complicated when researchers include many covariates in estimating the policy or treatment effects in an attempt to control for potential confounders. In this manuscript, to address both issues simultaneously, we propose a resampling-based procedure that not only lifts the winner's curse in evaluating the best policies observed in a random sample, but is also robust to the presence of many covariates. For multiple best policy effect sizes, the proposed inference procedure yields accurate point estimates and valid frequentist confidence intervals that achieve the exact nominal level as the sample size goes to infinity. We illustrate the finite-sample performance of our approach through Monte Carlo experiments and two empirical studies, which evaluate the most effective policies in charitable giving and the group of workers that benefits most from the National Supported Work program.
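To make the winner's curse described above concrete, the following is a minimal simulation sketch. It is not the resampling procedure proposed in the manuscript. It only illustrates why a naive estimate of the best-looking policy is optimistic. All names and settings here (`winners_curse_bias`, ten policies with identical true effects of zero, unit-variance noise, 50 observations per policy) are assumptions chosen for illustration.

```python
import random
import statistics


def winners_curse_bias(n_policies=10, n_obs=50, n_reps=2000, seed=0):
    """Return the average estimation error for (a) the policy that looks
    best in each sample and (b) a policy fixed in advance.

    Illustrative assumptions: every policy has the same true effect (0),
    and each estimated effect is a sample mean of unit-variance noise.
    """
    rng = random.Random(seed)
    true_effects = [0.0] * n_policies
    # Standard error of a sample mean over n_obs unit-variance observations.
    se = 1.0 / n_obs ** 0.5

    selected_gaps = []  # error for the data-selected "winner"
    fixed_gaps = []     # error for policy 0, chosen before seeing data
    for _ in range(n_reps):
        # Draw each estimated effect from its sampling distribution.
        estimates = [rng.gauss(mu, se) for mu in true_effects]
        best = max(range(n_policies), key=estimates.__getitem__)
        selected_gaps.append(estimates[best] - true_effects[best])
        fixed_gaps.append(estimates[0] - true_effects[0])

    return statistics.mean(selected_gaps), statistics.mean(fixed_gaps)


selected_bias, fixed_bias = winners_curse_bias()
print(f"mean error, data-selected best policy: {selected_bias:.3f}")
print(f"mean error, pre-specified policy:      {fixed_bias:.3f}")
```

Under these assumptions, the estimate for the pre-specified policy is unbiased, while the estimate for the data-selected winner is systematically too high, even though no policy is truly better than another. This selection-induced optimism is what inference that treats the winner as fixed fails to account for.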