This paper presents a new and effective simulation-based approach to conducting both finite- and large-sample inference for high-dimensional linear regression models. We develop this approach under the so-called repro samples framework, in which we conduct statistical inference by creating and studying the behavior of artificial samples that are obtained by mimicking the sampling mechanism of the data. We obtain confidence sets for the true model, for a single regression coefficient, or for any collection of regression coefficients. The proposed approach addresses two major gaps in the high-dimensional regression literature: (1) a lack of inference approaches that guarantee finite-sample performance; (2) a lack of effective approaches to address model selection uncertainty and provide inference for the underlying true model. We provide both finite-sample and asymptotic results to theoretically guarantee the performance of the proposed methods. Beyond these theoretical advantages, our numerical results demonstrate that the proposed methods achieve better coverage with smaller confidence sets than existing state-of-the-art approaches, such as debiasing and bootstrap approaches. We also extend our approaches to drawing inferences on functions of the regression coefficients.
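To make the idea of inference through artificial samples concrete, the following is a minimal sketch and not the procedure developed in the paper. It assumes a low-dimensional Gaussian linear model y = Xβ + σU with U ~ N(0, I), and uses a scale-free F-type statistic whose distribution is approximated by simulating artificial noise vectors. A candidate β is kept in the confidence set when the statistic evaluated at y - Xβ does not exceed the simulated cutoff. The function names (`f_type_statistic`, `simulated_cutoff`, `in_confidence_set`) and all numerical settings are hypothetical choices made for illustration.

```python
import numpy as np


def f_type_statistic(resid, X):
    """Scale-free ratio of the energy of `resid` inside vs. outside col(X)."""
    n, p = X.shape
    Q, _ = np.linalg.qr(X)            # orthonormal basis of the column space
    coords = Q.T @ resid              # coordinates of the projection onto col(X)
    fit = coords @ coords
    rest = resid @ resid - fit
    return (fit / p) / (rest / (n - p))


def simulated_cutoff(X, level=0.95, n_sim=2000, seed=0):
    """Approximate the `level` quantile of the statistic with artificial noise."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    stats = [f_type_statistic(rng.standard_normal(n), X) for _ in range(n_sim)]
    return float(np.quantile(stats, level))


def in_confidence_set(beta, y, X, cutoff):
    """Keep `beta` if its implied noise looks like the simulated noise."""
    return bool(f_type_statistic(y - X @ beta, X) <= cutoff)


# Toy data (assumed values, for illustration only).
rng = np.random.default_rng(1)
n, p = 50, 3
X = rng.standard_normal((n, p))
beta_true = np.array([1.0, 0.0, -2.0])
y = X @ beta_true + 0.5 * rng.standard_normal(n)

cutoff = simulated_cutoff(X)
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
print(in_confidence_set(beta_ols, y, X, cutoff))          # least-squares fit
print(in_confidence_set(beta_true + 100.0, y, X, cutoff))  # far-away candidate
```

Because the statistic is invariant to the noise scale, the unknown σ does not need to be estimated in this toy setting. The high-dimensional case treated in the paper additionally has to handle uncertainty about the true model, which this sketch does not attempt.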