Subspace Learning for Feature Selection via Rank Revealing QR Factorization: Unsupervised and Hybrid Approaches with Non-negative Matrix Factorization and Evolutionary Algorithm

October 2, 2022

Amir Moslemi, Arash Ahmadian

From arXiv, 34 pages, 10 figures, 4 tables

Selecting the most informative and discriminative features from high-dimensional data is recognized as an important topic in machine learning and data engineering. Feature selection based on matrix factorization techniques, such as non-negative matrix factorization, has become an active research area. The main goal of feature selection using matrix factorization is to extract a subspace that approximates the original space in a lower dimension. In this study, rank revealing QR (RRQR) factorization, which is computationally cheaper than singular value decomposition (SVD), is leveraged to obtain the most informative features as a novel unsupervised feature selection technique. This technique uses the permutation matrix of QR for feature selection, a property unique to this factorization method. Moreover, QR factorization is embedded into the non-negative matrix factorization (NMF) objective function as a new unsupervised feature selection method. Lastly, a hybrid feature selection algorithm is proposed by coupling RRQR, as a filter-based technique, with a genetic algorithm, as a wrapper-based technique. In this method, redundant features are removed using RRQR factorization, and the most discriminative subset of features is selected using the genetic algorithm. The proposed algorithm is shown to be dependable and robust when compared against state-of-the-art feature selection algorithms in supervised, unsupervised, and semi-supervised settings. All methods are tested on seven available microarray datasets using KNN, SVM, and C4.5 classifiers. In terms of evaluation metrics, the experimental results show that the proposed method is comparable with state-of-the-art feature selection methods.
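To make the role of the permutation matrix concrete, here is a minimal sketch of pivot-based feature selection. It assumes that column-pivoted QR, as exposed by `scipy.linalg.qr(..., pivoting=True)`, is an acceptable practical stand-in for RRQR; the paper's exact RRQR variant is not specified in the abstract. The function name `select_features_pivoted_qr` and the toy data are hypothetical and exist only for illustration.

```python
import numpy as np
from scipy.linalg import qr


def select_features_pivoted_qr(X, k):
    """Return the indices of k columns (features) of X picked by pivoted QR.

    X has shape (n_samples, n_features). Column pivoting factors X[:, perm]
    as Q @ R and orders the columns so that the leading ones are the most
    linearly independent. This is an assumed stand-in for RRQR, not the
    authors' published algorithm.
    """
    _, _, perm = qr(X, mode="economic", pivoting=True)
    return perm[:k]


# Toy data (hypothetical): three independent features plus two redundant ones.
rng = np.random.default_rng(0)
base = rng.standard_normal((50, 3))
X = np.column_stack([
    base,                     # features 0, 1, 2: independent
    2.0 * base[:, 0],         # feature 3: rescaled copy of feature 0
    base[:, 1] + base[:, 2],  # feature 4: sum of features 1 and 2
])

selected = select_features_pivoted_qr(X, k=3)
print("selected feature indices:", selected)
print("rank of selected columns:", np.linalg.matrix_rank(X[:, selected]))
```

Because the toy matrix has rank 3, the first three pivots are necessarily linearly independent, so the redundant columns cannot all be chosen together with the features they depend on. In the hybrid setting described above, a subset like this would be the filter output that a wrapper such as a genetic algorithm then searches over.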

Feature selection

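Feature selection, also called feature subset selection (FSS) or attribute selection, is the process of choosing N features from an existing set of M features so as to optimize a specified metric of the system. It picks the most effective of the original features in order to reduce the dimensionality of the dataset. It is an important means of improving the performance of learning algorithms and a key data preprocessing step in pattern recognition. For a learning algorithm, good training samples are the key to training a model.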
