We study regression discontinuity designs in which many covariates, possibly many more than the number of observations, are available. We consider a two-step algorithm that first selects the covariates to be used through a localized Lasso-type procedure and then estimates the treatment effect by including the selected covariates in the usual local linear estimator. We provide an in-depth analysis of the algorithm's theoretical properties, showing that, under an approximate sparsity condition, the resulting estimator is asymptotically normal, with asymptotic bias and variance that are conceptually similar to those obtained in low-dimensional settings. Bandwidth selection and inference can be carried out using standard methods. We also provide simulations and an empirical application.
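The two-step procedure described in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the exact estimator, so everything below is an assumption made for illustration: a cutoff at zero, a triangular kernel, a fixed bandwidth `h` and penalty level `lam`, covariates standardized under the kernel weights, and a plain coordinate-descent Lasso. The function names (`rd_design`, `wls`, `select_covariates`, `rd_estimate`) are hypothetical and not taken from the paper.

```python
import numpy as np

def rd_design(x, h):
    """Local linear RD regressors and triangular kernel weights (cutoff assumed at 0)."""
    t = (x >= 0).astype(float)                      # treatment indicator
    V = np.column_stack([np.ones_like(x), t, x / h, t * x / h])
    k = np.maximum(1.0 - np.abs(x / h), 0.0)        # zero weight outside the window
    return V, k

def wls(A, b, k):
    """Weighted least squares coefficients of b (vector or matrix) on A."""
    sw = np.sqrt(k)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], (b.T * sw).T, rcond=None)
    return coef

def select_covariates(y, x, Z, h, lam, n_iter=100):
    """Step 1 (assumed form): kernel-weighted Lasso of y on Z near the cutoff.

    The local linear terms are left unpenalized by partialling them out first.
    """
    V, k = rd_design(x, h)
    y_r = y - V @ wls(V, y, k)
    Z_r = Z - V @ wls(V, Z, k)
    wn = k / k.sum()                                # normalized kernel weights
    scale = np.sqrt(wn @ Z_r**2)
    scale[scale == 0] = 1.0                         # guard constant columns
    Zs = Z_r / scale                                # unit weighted second moment
    gamma = np.zeros(Z.shape[1])
    resid = y_r.copy()
    for _ in range(n_iter):
        for j in range(Z.shape[1]):
            resid += Zs[:, j] * gamma[j]            # drop covariate j's contribution
            rho = wn @ (Zs[:, j] * resid)
            gamma[j] = np.sign(rho) * max(abs(rho) - lam, 0.0)  # soft threshold
            resid -= Zs[:, j] * gamma[j]
    return np.flatnonzero(gamma != 0)

def rd_estimate(y, x, Z, h, selected):
    """Step 2: local linear RD regression augmented with the selected covariates."""
    V, k = rd_design(x, h)
    A = np.column_stack([V, Z[:, selected]])
    return wls(A, y, k)[1]                          # coefficient on the treatment indicator
```

A call would look like `sel = select_covariates(y, x, Z, h=0.5, lam=0.2)` followed by `rd_estimate(y, x, Z, 0.5, sel)`. In practice the bandwidth and penalty level would be chosen by a data-driven rule rather than fixed by hand as in this sketch.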