Blocking, a special case of rerandomization, is routinely implemented in the design stage of randomized experiments to balance baseline covariates. Regression adjustment is highly encouraged in the analysis stage to adjust for the remaining covariate imbalances. Researchers have recommended combining these techniques; however, research on this combination in a randomization-based inference framework with a large number of covariates is limited. This paper proposes several methods that combine blocking, rerandomization, and regression adjustment in randomized experiments with high-dimensional covariates. In the design stage, we suggest using blocking, rerandomization, or both to balance a fixed number of covariates most relevant to the outcomes. For the analysis stage, we propose Lasso-based regression adjustment methods to adjust for the remaining imbalances in the additional high-dimensional covariates. Moreover, we establish the asymptotic properties of the proposed Lasso-adjusted average treatment effect estimators and outline conditions under which these estimators are more efficient than the unadjusted estimators. In addition, we provide conservative variance estimators to facilitate valid inferences. Our analysis is randomization-based, allowing the outcome data-generating models to be misspecified. Simulation studies and two real data analyses demonstrate the advantages of the proposed methods.
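To make the two-stage idea concrete, the following is a minimal sketch, not the paper's implementation. It assumes a single block, an acceptance rule based on standardized mean differences (one common balance criterion; the paper's own criterion is not stated here), a fixed Lasso penalty `lam` rather than a tuned one, and a standard regression-adjusted difference in means in which each arm's covariate means are shifted toward the full-sample means. All function names are hypothetical, the data are simulated, and variance estimation is omitted.

```python
import random
import statistics


def std_mean_diff(x, z):
    """Absolute difference in arm means of covariate x, scaled by its overall SD."""
    treated = [v for v, a in zip(x, z) if a == 1]
    control = [v for v, a in zip(x, z) if a == 0]
    return abs(statistics.fmean(treated) - statistics.fmean(control)) / statistics.pstdev(x)


def rerandomize(design_covs, n_treated, threshold, rng, max_draws=10_000):
    """Design stage: redraw a complete randomization until all design covariates balance."""
    n = len(design_covs[0])
    z = [1] * n_treated + [0] * (n - n_treated)
    for _ in range(max_draws):
        rng.shuffle(z)
        if all(std_mean_diff(x, z) <= threshold for x in design_covs):
            return list(z)
    raise RuntimeError("no acceptable assignment found; loosen the threshold")


def soft_threshold(v, lam):
    return max(v - lam, 0.0) if v > 0 else min(v + lam, 0.0)


def lasso_coefs(cols, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1 on centered data."""
    n = len(y)
    y_bar = statistics.fmean(y)
    centered = []
    for c in cols:
        m = statistics.fmean(c)
        centered.append([v - m for v in c])
    resid = [v - y_bar for v in y]
    beta = [0.0] * len(cols)
    for _ in range(n_iter):
        for j, c in enumerate(centered):
            scale = sum(v * v for v in c) / n
            if scale == 0.0:
                continue  # constant column carries no information
            # Correlation of column j with the residual that excludes its own fit.
            rho = sum(v * (r + beta[j] * v) for v, r in zip(c, resid)) / n
            new = soft_threshold(rho, lam) / scale
            resid = [r - (new - beta[j]) * v for v, r in zip(c, resid)]
            beta[j] = new
    return beta


def lasso_adjusted_ate(cols, y, z, lam):
    """Analysis stage: difference of arm means, each corrected for covariate imbalance."""
    n = len(y)
    overall = [statistics.fmean(c) for c in cols]
    arm_value = {}
    for arm in (1, 0):
        idx = [i for i in range(n) if z[i] == arm]
        arm_cols = [[c[i] for i in idx] for c in cols]
        arm_y = [y[i] for i in idx]
        beta = lasso_coefs(arm_cols, arm_y, lam)  # Lasso fitted within the arm
        shift = sum(b * (statistics.fmean(c) - m)
                    for b, c, m in zip(beta, arm_cols, overall))
        arm_value[arm] = statistics.fmean(arm_y) - shift
    return arm_value[1] - arm_value[0]


# Simulated illustration (assumed data, not from the paper): 200 units, 20 covariates,
# the first two balanced in the design, a constant treatment effect of 2.
rng = random.Random(0)
n, p = 200, 20
cols = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(p)]
z = rerandomize(cols[:2], n_treated=n // 2, threshold=0.1, rng=rng)
y = [2 * z[i] + 3 * cols[0][i] + 2 * cols[1][i] + 1.5 * cols[2][i] + rng.gauss(0, 1)
     for i in range(n)]
estimate = lasso_adjusted_ate(cols, y, z, lam=0.1)
```

The sketch keeps the design covariates few and fixed while letting the Lasso handle the remaining columns, mirroring the division of labor described above. In practice the penalty would be chosen by a data-driven rule, and a blocked design would call for block-specific weighting that this sketch leaves out.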