Factor and sparse models are two widely used methods to impose a low-dimensional structure in high-dimension. They are seemingly mutually exclusive. We propose a lifting method that combines the merits of these two models in a supervised learning methodology that allows to efficiently explore all the information in high-dimensional datasets. The method is based on a flexible model for high-dimensional panel data, called factor-augmented regression (FarmPredict) model with both observable or latent common factors, as well as idiosyncratic components. This model not only includes both principal component (factor) regression and sparse regression as specific models but also significantly weakens the cross-sectional dependence and hence facilitates model selection and interpretability. The methodology consists of three steps. At each step, the remaining cross-section dependence can be inferred by a novel test for covariance structure in high-dimensions. We developed asymptotic theory for the FarmPredict model and demonstrated the validity of the multiplier bootstrap for testing high-dimensional covariance structure. This is further extended to testing high-dimensional partial covariance structures. The theory is supported by a simulation study and applications to the construction of a partial covariance network of the financial returns and a prediction exercise for a large panel of macroeconomic time series from FRED-MD database.
翻译:系数和稀少模型是将低维结构强加给高维数据集的两种广泛使用的方法。它们看起来是相互排斥的。我们建议采用一种提升方法,将这两个模型的优点结合到一种监督的学习方法中,以便能够有效地探索高维数据集中的所有信息。这种方法基于一种灵活的高维面面板数据模型,称为要素增强(FarmPredict)模型,该模型既具有可观测或潜在的共同因素,也具有异质共性成分。这一模型不仅包括主要组成部分(成份)回归和稀疏回归作为具体模型,而且还大大削弱了跨部门依赖性,从而便利了模型的选择和可解释性。该方法由三个步骤组成。在每一个步骤中,其余的跨部门依赖性都可以通过高维面数据集的共变性结构的新测试来推断。我们为FarmPredict模型开发了“静态”理论,并展示了用于测试高维度变异结构的乘性靴壳的有效性。这一模型还进一步扩展了测试高维度部分易变性结构,从而降低了跨部门依赖性,从而便利了模型的选择和可解释性。该方法包括三个步骤。在高维度数据库的模型中,通过模拟和大规模宏观经济变数数据库中的一项理论得到支持。