Factor and sparse models are two widely used methods for imposing a low-dimensional structure in high dimensions. They are seemingly mutually exclusive. In this paper, we propose a simple lifting method that combines the merits of these two models in a supervised learning methodology that allows one to efficiently explore all the information in high-dimensional datasets. The method is based on a flexible model for panel data, called the factor-augmented regression model, which uses both observable and latent common factors, as well as idiosyncratic components, as high-dimensional covariates. This model not only includes both factor regression and sparse regression as special cases but also significantly weakens the cross-sectional dependence and hence facilitates model selection and interpretability. The methodology consists of three steps. At each step, the remaining cross-sectional dependence can be inferred by a novel test for covariance structure in high dimensions. We develop asymptotic theory for the factor-augmented sparse regression model and demonstrate the validity of the multiplier bootstrap for testing high-dimensional covariance structure. This is further extended to testing high-dimensional partial covariance structures. The theory and methods are supported by an extensive simulation study and by two applications: the construction of a partial covariance network of financial returns, and a prediction exercise for a large panel of macroeconomic time series from the FRED-MD database.
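The abstract does not spell out the three steps, so the following is only a minimal sketch of one plausible reading of a factor-augmented sparse regression, not the authors' procedure. It assumes that the latent factors are estimated by principal components, that the response is first regressed on the estimated factors, and that the remaining signal is fitted by a Lasso on the estimated idiosyncratic components. The function names (`estimate_factors`, `lasso_cd`, `fit_factor_augmented`), the number of factors `k`, and the penalty `lam` are all hypothetical choices made for illustration; the covariance-structure test and the multiplier bootstrap are not covered.

```python
import numpy as np


def estimate_factors(X, k):
    """Split X (n x p) into k principal-component factors and a remainder.

    Assumption: latent factors are estimated by PCA. Returns F (n x k),
    loadings B (p x k), and estimated idiosyncratic components U (n x p).
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)                      # centre each covariate
    left, _, _ = np.linalg.svd(Xc, full_matrices=False)
    F = np.sqrt(n) * left[:, :k]                 # normalised so F'F / n = I_k
    B = Xc.T @ F / n                             # least-squares loadings
    U = Xc - F @ B.T                             # what the factors do not explain
    return F, B, U


def lasso_cd(Z, r, lam, n_iter=200):
    """Coordinate descent for (1 / 2n) * ||r - Z b||^2 + lam * ||b||_1."""
    n, p = Z.shape
    b = np.zeros(p)
    col_sq = (Z ** 2).sum(axis=0) / n
    resid = r.copy()                             # invariant: resid == r - Z @ b
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = Z[:, j] @ resid / n + col_sq[j] * b[j]
            new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            resid += Z[:, j] * (b[j] - new)      # keep the invariant after the update
            b[j] = new
    return b


def fit_factor_augmented(X, y, k, lam):
    """Hypothetical three-stage fit: PCA factors, OLS on factors, Lasso on U."""
    F, _, U = estimate_factors(X, k)
    yc = y - y.mean()
    gamma = np.linalg.lstsq(F, yc, rcond=None)[0]   # factor-regression part
    beta = lasso_cd(U, yc - F @ gamma, lam)         # sparse-regression part
    return gamma, beta, F, U
```

In this sketch, setting `lam` very large shrinks `beta` to zero and leaves a pure factor regression, while `k = 0` would correspond to a plain sparse regression on the covariates; this mirrors the claim that both models are nested in the factor-augmented one. Because the estimated idiosyncratic components are orthogonal to the estimated factors by construction, the Lasso operates on covariates from which the common variation has been removed.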