In traditional multivariate data analysis, dimension reduction and regression have been treated as separate steps. Established techniques such as principal component regression (PCR) and partial least squares (PLS) regression first compute latent components, each under a different criterion, and only then perform the regression. In this paper, we introduce a regression method named PLS-integrated Lasso (PLS-Lasso) that integrates dimension reduction directly into the regression process. We present two formulations, denoted PLS-Lasso-v1 and PLS-Lasso-v2, along with clear and effective algorithms that ensure convergence to global optima. We compare PLS-Lasso-v1 and PLS-Lasso-v2 with Lasso on the task of financial index tracking, where both show promising results.
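As background for the two-step pattern described above, the following is a minimal sketch of principal component regression, assuming NumPy. The function name `pcr_fit`, the choice of `k`, and the synthetic data are illustrative assumptions and do not come from the paper; the sketch shows only the conventional "reduce, then regress" baseline and does not implement PLS-Lasso.

```python
import numpy as np


def pcr_fit(X, y, k):
    """Two-step PCR sketch: PCA on X first, then least squares on the components."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean

    # Step 1 (dimension reduction): top-k right singular vectors of centered X.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:k].T          # (p, k) loadings
    Z = Xc @ V            # (n, k) latent components

    # Step 2 (regression): ordinary least squares of y on the components.
    gamma, *_ = np.linalg.lstsq(Z, yc, rcond=None)

    # Map the component coefficients back to the original feature space.
    beta = V @ gamma
    intercept = y_mean - x_mean @ beta
    return beta, intercept


# Synthetic data, purely for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = X @ rng.normal(size=10) + 0.1 * rng.normal(size=200)

beta, intercept = pcr_fit(X, y, k=3)
y_hat = X @ beta + intercept
```

Note that the components in step 1 are chosen without looking at `y`, which is the separation between dimension reduction and regression that the abstract refers to. When `k` equals the number of features, this sketch reduces to ordinary least squares with an intercept.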