In this paper, we develop invariance-based procedures for testing and inference in high-dimensional regression models. These procedures, also known as randomization tests, provide several important advantages. First, for the global null hypothesis of significance, our test is valid in finite samples. It is also simple to implement and comes with finite-sample guarantees on statistical power. Remarkably, despite its simplicity, this testing idea has escaped the attention of earlier analytical work, which mainly concentrated on complex high-dimensional asymptotic methods. Under an additional assumption of Gaussian design, we show that this test also achieves the minimax optimal rate against certain nonsparse alternatives, a type of result that is rare in the literature. Second, for partial null hypotheses, we propose residual-based tests and derive theoretical conditions for their validity. These tests can be made powerful by constructing the test statistic in a way that, first, selects the important covariates (e.g., through Lasso) and then orthogonalizes the nuisance parameters. We illustrate our results through extensive simulations and applied examples. One consistent finding is that the strong finite-sample guarantees associated with our procedures result in added robustness when it comes to handling multicollinearity and heavy-tailed covariates.
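To make the global-null test concrete, the following is a minimal sketch of a randomization test for H0: beta = 0 in the linear model y = X beta + eps. It is not the paper's exact procedure. It assumes, for illustration only, that the errors are exchangeable, so that under H0 the distribution of y is invariant to permutations of its entries. The statistic ||X^T y||^2, the function name `global_null_permutation_test`, and its parameters are likewise illustrative choices rather than anything specified in the abstract.

```python
import numpy as np


def global_null_permutation_test(X, y, n_perm=999, seed=0):
    """Permutation p-value for H0: beta = 0 in y = X @ beta + eps.

    Assumption (not from the source): the errors are exchangeable, so under
    H0 every permutation of y is equally likely given X.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    def statistic(response):
        # Illustrative statistic: squared norm of the marginal correlations.
        return float(np.sum((X.T @ response) ** 2))

    t_obs = statistic(y)
    # Recompute the statistic on randomly permuted responses.
    t_perm = np.array([statistic(rng.permutation(y)) for _ in range(n_perm)])

    # Counting the observed statistic in numerator and denominator gives a
    # p-value that is valid in finite samples under the invariance assumption.
    return float((1.0 + np.sum(t_perm >= t_obs)) / (n_perm + 1.0))
```

The p-value depends on the data only through the ranking of the observed statistic among its permuted copies, which is why validity does not rely on asymptotics or on the ratio of covariates to observations. Other invariances (for example, sign flips under symmetric errors) could be substituted by changing how the randomized responses are generated.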