Empirical researchers increasingly face rich data sets containing many controls or instrumental variables, which makes an appropriate approach to variable selection essential. In this paper, we provide results for valid inference after post-$L_2$-Boosting or orthogonal $L_2$-Boosting is used for variable selection. We consider treatment effects after selecting among many control variables, and instrumental variable models with potentially many instruments. To achieve this, we establish new results for the rate of convergence of iterated post-$L_2$-Boosting and orthogonal $L_2$-Boosting in a high-dimensional setting similar to that studied for Lasso, i.e., under approximate sparsity without assuming the beta-min condition. These results are extended to the 2SLS framework, and valid inference is provided for treatment effect analysis. We give extensive simulation results for the proposed methods and compare them with Lasso. In an empirical application, we use our proposed methods to construct efficient IVs and estimate the effect of pre-merger overlap of bank branch networks in the US on the post-merger stock returns of the acquirer bank.
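As a rough illustration of the selection step, orthogonal $L_2$-Boosting can be sketched as a greedy procedure: at each step, pick the regressor most correlated with the current residual, then refit least squares on all regressors selected so far. The sketch below is a minimal version under stated assumptions, not the procedure analyzed in the paper: the function name `orthogonal_l2_boosting`, the fixed number of steps `n_steps` (used in place of a data-driven stopping rule), and the simulated data are all hypothetical.

```python
import numpy as np


def orthogonal_l2_boosting(X, y, n_steps):
    """Greedy variable selection with an OLS refit after every step.

    Assumptions of this sketch: no column of X is constant, and the
    number of steps is fixed in advance instead of being chosen by a
    stopping criterion.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Center and scale the columns so that inner products with the
    # residual are comparable across regressors.
    Xc = X - X.mean(axis=0)
    Xs = Xc / np.linalg.norm(Xc, axis=0)
    yc = y - y.mean()

    selected = []
    coef = np.zeros(0)
    resid = yc.copy()
    for _ in range(n_steps):
        # Score every regressor by its absolute inner product with
        # the current residual; skip those already selected.
        scores = np.abs(Xs.T @ resid)
        if selected:
            scores[selected] = -np.inf
        selected.append(int(np.argmax(scores)))

        # Orthogonal step: refit least squares on the whole selected
        # set, then recompute the residual from that fit.
        coef, *_ = np.linalg.lstsq(Xs[:, selected], yc, rcond=None)
        resid = yc - Xs[:, selected] @ coef
    return selected, coef


if __name__ == "__main__":
    # Hypothetical sparse design: only columns 0 and 3 matter.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 50))
    y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + 0.5 * rng.standard_normal(200)
    selected, coef = orthogonal_l2_boosting(X, y, n_steps=2)
    print("selected columns:", selected)
```

The returned coefficients are on the scale of the centered and normalized regressors. In a treatment-effect or IV application, the selected columns would then enter a second-stage regression; that stage is not shown here.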