Feature Grouping and Sparse Principal Component Analysis

Sparse Principal Component Analysis (SPCA) is widely used in data processing and dimension reduction; it uses the lasso to produce modified principal components with sparse loadings, which are easier to interpret. However, sparse PCA does not consider an additional grouping structure in which loadings share similar coefficients (feature grouping); it only identifies the special group whose coefficients are all zero (feature selection). In this paper, we propose a novel method called Feature Grouping and Sparse Principal Component Analysis (FGSPCA), which allows the loadings to belong to disjoint homogeneous groups, with sparsity as a special case. FGSPCA is a subspace learning method designed to perform grouping pursuit and feature selection simultaneously by imposing a non-convex regularization with naturally adjustable sparsity and grouping effects. To solve the resulting non-convex optimization problem, we propose an alternating algorithm that combines difference-of-convex programming, the augmented Lagrangian method, and coordinate descent. Experimental results on real data sets show that FGSPCA benefits from the grouping effect compared with methods that lack it.
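To make the idea of sparse loadings concrete, here is a minimal sketch of how an L1-style (lasso) shrinkage step drives small loadings to exactly zero. It is a generic soft-thresholded power iteration, not the FGSPCA algorithm from the abstract and not the original SPCA solver; the function names, the synthetic data, and the threshold `lam` are all assumptions made for illustration. It performs feature selection only and does not impose the grouping penalty that FGSPCA adds.

```python
import numpy as np

def soft_threshold(v, lam):
    """Lasso-style shrinkage: move each entry toward zero by lam, clipping at zero."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def sparse_leading_loading(X, lam, n_iter=200):
    """Illustrative sparse first loading via soft-thresholded power iteration.

    Hypothetical helper for this sketch; not the method proposed in the paper.
    """
    Xc = X - X.mean(axis=0)                 # center each feature
    S = Xc.T @ Xc / (len(Xc) - 1)           # sample covariance matrix
    w = np.linalg.eigh(S)[1][:, -1]         # start from the ordinary leading eigenvector
    for _ in range(n_iter):
        w_new = soft_threshold(S @ w, lam)  # power step followed by shrinkage
        norm = np.linalg.norm(w_new)
        if norm == 0.0:                     # lam too large: every loading was removed
            return w_new
        w = w_new / norm                    # renormalize to unit length
    return w

# Synthetic data (assumed): three features share one latent signal, three are pure noise.
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 1))
X = np.hstack([z + 0.1 * rng.normal(size=(500, 3)),
               0.1 * rng.normal(size=(500, 3))])

w = sparse_leading_loading(X, lam=0.1)
print(np.round(w, 3))
```

In this toy setting the three noise features receive zero loadings, while the three signal features keep nonzero loadings of similar size. A grouping penalty of the kind the abstract describes would go one step further and encourage such similar loadings to share a common value.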

Related content

PCA

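In statistics, principal component analysis (PCA) is a method for projecting data from a higher-dimensional space into a lower-dimensional space by maximizing the variance along each dimension. Given a collection of points in two, three, or more dimensions, a "best-fitting" line can be defined as the line that minimizes the average squared distance from the points to the line. The next best-fitting line can be chosen in the same way from the directions perpendicular to the first. Repeating this process yields an orthogonal basis in which the individual dimensions of the data are uncorrelated. These basis vectors are called principal components.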
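The description above can be checked numerically. The following sketch computes principal components with a singular value decomposition of the centered data, one common way to implement PCA; the helper name `pca` and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def pca(X, k):
    """Return the first k principal components, the projected data, and their variances."""
    Xc = X - X.mean(axis=0)                            # center each feature
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)  # rows of Vt are orthonormal directions
    components = Vt[:k]                                # principal components (basis vectors)
    scores = Xc @ components.T                         # data projected onto the components
    explained_variance = s[:k] ** 2 / (len(X) - 1)     # variance along each component
    return components, scores, explained_variance

# Synthetic correlated data (assumed for the example).
rng = np.random.default_rng(0)
mixing = np.array([[3.0, 1.0, 0.0],
                   [0.0, 1.0, 0.5],
                   [0.0, 0.0, 0.2]])
X = rng.normal(size=(200, 3)) @ mixing

components, scores, variances = pca(X, k=2)

# The components form an orthonormal basis, and the projected dimensions are uncorrelated.
print(np.round(components @ components.T, 6))
print(np.round(np.cov(scores, rowvar=False), 6))
```

The first printed matrix is the identity (orthonormal basis), and the second is diagonal, with the variances in decreasing order on the diagonal and zero covariance between the projected dimensions.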
