Strong correlations among explanatory variables are problematic for high-dimensional regularized regression methods. Because the Irrepresentable Condition can be violated, the popular LASSO method may falsely include inactive variables. In this paper, we propose pre-processing with orthogonal decompositions (PROD) for the explanatory variables in high-dimensional regressions. The PROD procedure is built on a generic orthogonal decomposition of the design matrix. We demonstrate with two concrete cases that PROD can be effectively constructed to improve the performance of high-dimensional penalized regression. Our theoretical analysis reveals the properties and benefits of PROD for high-dimensional penalized linear regression with LASSO. Extensive numerical studies, including simulations and data analysis, show the promising performance of PROD.
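To make the idea of an orthogonal decomposition of the design matrix concrete, here is a minimal sketch in NumPy. It assumes a hypothetical PCA-style variant: the design matrix is split into a projection onto its leading left singular vectors (capturing common-factor structure that drives the strong correlations) and an orthogonal residual; the number of factors `k` and all simulation settings are illustrative assumptions, not the paper's actual construction.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, k = 200, 50, 2

# Simulate strongly correlated columns via a latent factor model:
# each column of X loads on k common factors plus idiosyncratic noise.
F = rng.standard_normal((n, k))
B = rng.standard_normal((k, p))
X = F @ B + 0.5 * rng.standard_normal((n, p))

# PCA-style orthogonal decomposition (an assumed illustrative variant):
# project X onto its top-k left singular vectors and keep the residual.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
P = U[:, :k] @ U[:, :k].T          # projection onto leading factor space
X_factor = P @ X                   # component driven by common factors
X_resid = X - X_factor             # pre-processed design, weaker correlations

# The two components are orthogonal by construction.
print(np.abs(X_factor.T @ X_resid).max())  # numerically near zero

# Maximum off-diagonal column correlation before and after pre-processing.
corr_before = np.abs(np.corrcoef(X, rowvar=False) - np.eye(p)).max()
corr_after = np.abs(np.corrcoef(X_resid, rowvar=False) - np.eye(p)).max()
print(corr_before, corr_after)
```

In this sketch, a penalized regression such as LASSO would then be fit on the residual component, whose columns are far less correlated, so the Irrepresentable Condition is less likely to be violated.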