Factor and sparse models are two widely used methods for imposing a low-dimensional structure in high dimensions. However, they appear to be mutually exclusive. We propose a lifting method that combines the merits of these two models in a supervised learning methodology, allowing all the information in high-dimensional datasets to be explored efficiently. The method is based on a flexible model for high-dimensional panel data, called the factor-augmented regression model, with observable and/or latent common factors as well as idiosyncratic components. This model not only includes both principal component regression and sparse regression as special cases but also significantly weakens the cross-sectional dependence and facilitates model selection and interpretability. The method consists of several steps and a novel test for (partial) covariance structure in high dimensions, which is used to infer the remaining cross-sectional dependence at each step. We develop the theory for the model and demonstrate the validity of the multiplier bootstrap for testing a high-dimensional (partial) covariance structure. The theory is supported by a simulation study and applications.
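To make the idea of combining factor and sparse structure concrete, the following is a minimal sketch, not the authors' procedure. It assumes that the latent factors are estimated by principal components, that the sparse step is a lasso fitted by coordinate descent, and that the number of factors `k` and the penalty `lam` are given. All function names are hypothetical, and the covariance-structure test and multiplier bootstrap described in the abstract are not shown.

```python
import numpy as np


def estimate_factors(X, k):
    """PCA estimate of k latent factors.

    Returns (F, U): F is n x k with F'F / n = I_k, and U is the n x p
    matrix of estimated idiosyncratic components (assumed decomposition).
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    # Leading left singular vectors span the estimated factor space.
    left, _, _ = np.linalg.svd(Xc, full_matrices=False)
    F = np.sqrt(n) * left[:, :k]
    loadings = Xc.T @ F / n          # p x k least-squares loadings
    U = Xc - F @ loadings.T          # what the factors do not explain
    return F, U


def soft_threshold(z, t):
    """Scalar soft-thresholding operator used by the lasso update."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def lasso_cd(A, b, lam, n_iter=200):
    """Coordinate descent for (1 / 2n) * ||b - A beta||^2 + lam * ||beta||_1."""
    n, p = A.shape
    beta = np.zeros(p)
    col_sq = (A ** 2).sum(axis=0) / n
    resid = b.astype(float).copy()   # always equals b - A @ beta
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = A[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            resid += A[:, j] * (beta[j] - new)
            beta[j] = new
    return beta


def factor_augmented_sparse_fit(X, y, k, lam):
    """Regress y on estimated factors, then lasso on idiosyncratic parts."""
    F, U = estimate_factors(X, k)
    yc = y - y.mean()
    gamma = np.linalg.lstsq(F, yc, rcond=None)[0]   # factor coefficients
    beta = lasso_cd(U, yc - F @ gamma, lam)         # sparse coefficients
    return gamma, beta
```

In this sketch, setting `lam` large enough that every entry of `beta` is zero leaves a principal component regression, while `k = 0` would leave a plain sparse regression on the centered covariates, which mirrors the two special cases named in the abstract. Because the estimated idiosyncratic components are orthogonal to the estimated factors by construction, fitting the two blocks in sequence does not change the factor coefficients.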