Principal component analysis (PCA) is a well-known linear dimension-reduction method that has been widely used in data analysis and modeling. It is an unsupervised learning technique that identifies a linear subspace of the input variables that captures maximal variation, preserving as much information as possible. PCA has also been used in prediction models, where the original, high-dimensional space of predictors is reduced to a smaller, more manageable set of components before conducting regression analysis. However, this approach does not use information about the response during the dimension-reduction stage and can therefore yield poor predictive performance. To address this concern, several supervised linear dimension-reduction techniques have been proposed in the literature. This paper reviews selected techniques, extends some of them, and compares their performance through simulations. Two of these techniques, partial least squares (PLS) and least-squares PCA (LSPCA), consistently outperform the others in this study.
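The contrast between unsupervised and supervised dimension reduction can be illustrated with a minimal sketch (not the paper's code) using scikit-learn's PCA and PLSRegression. The synthetic data, component count, and noise level below are illustrative assumptions, chosen so that the response depends on a low-variance direction of the predictors, a setting in which unsupervised PCA can discard the predictive signal while PLS retains it.

```python
# Minimal sketch: principal component regression (PCR, unsupervised reduction)
# vs. partial least squares (PLS, supervised reduction). Illustrative only.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cross_decomposition import PLSRegression
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n, p, k = 500, 50, 2                       # samples, predictors, retained components

# Predictors with decaying variances; the response loads only on the
# last (lowest-variance) column, which top-variance PCA tends to drop.
X = rng.normal(size=(n, p)) * np.linspace(10.0, 0.1, p)
beta = np.zeros(p)
beta[-1] = 5.0
y = X @ beta + rng.normal(scale=0.5, size=n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# PCR: reduce X with unsupervised PCA, then regress y on the scores.
pcr = make_pipeline(PCA(n_components=k), LinearRegression()).fit(X_tr, y_tr)

# PLS: chooses components using their covariance with y (supervised).
pls = PLSRegression(n_components=k).fit(X_tr, y_tr)

print("PCR test R^2:", pcr.score(X_te, y_te))   # near 0 in this setup
print("PLS test R^2:", pls.score(X_te, y_te))   # close to 1 in this setup
```

In this construction PCR's components track the high-variance but uninformative directions, so its test R^2 is near zero, whereas PLS recovers the predictive direction; with signal aligned to the top principal components, the two methods would perform comparably.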