Sparse Principal Component Analysis (SPCA) is widely used in data processing and dimension reduction; it uses the lasso to produce modified principal components with sparse loadings for better interpretability. However, sparse PCA does not consider an additional grouping structure in which loadings share similar coefficients (i.e., feature grouping), beyond the special group with all coefficients equal to zero (i.e., feature selection). In this paper, we propose a novel method called Feature Grouping and Sparse Principal Component Analysis (FGSPCA), which allows the loadings to belong to disjoint homogeneous groups, with sparsity as a special case. The proposed FGSPCA is a subspace learning method designed to perform grouping pursuit and feature selection simultaneously, by imposing a non-convex regularization with naturally adjustable sparsity and grouping effects. To solve the resulting non-convex optimization problem, we propose an alternating algorithm that incorporates difference-of-convex programming, the augmented Lagrangian method, and coordinate descent. Experimental results on real data sets show that the proposed FGSPCA benefits from the grouping effect compared with methods that do not exploit it.
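To make the construction concrete, the display below recalls the standard SPCA self-contained regression criterion of Zou et al. and sketches, schematically, where a combined sparsity-and-grouping penalty would enter; the penalty $P_{\lambda_1,\lambda_2}$ shown here is an illustrative placeholder, not the paper's precise FGSPCA formulation.

\begin{align*}
\text{SPCA:}\quad &\min_{A,B}\ \|X - XBA^{\top}\|_F^2 \;+\; \lambda \sum_{j=1}^{k}\|\beta_j\|_2^2 \;+\; \sum_{j=1}^{k}\lambda_{1,j}\|\beta_j\|_1
\quad \text{s.t.}\ A^{\top}A = I_k,\\
\text{FGSPCA (schematic):}\quad &\min_{A,B}\ \|X - XBA^{\top}\|_F^2 \;+\; \sum_{j=1}^{k} P_{\lambda_1,\lambda_2}(\beta_j)
\quad \text{s.t.}\ A^{\top}A = I_k,
\end{align*}

where $B = [\beta_1,\dots,\beta_k]$ collects the sparse loadings and $P_{\lambda_1,\lambda_2}$ denotes a non-convex penalty that shrinks small loadings to exactly zero (feature selection) while pulling similar loadings toward a common value (feature grouping).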