High-dimensional data are common in real-world applications such as biology, computer vision, and social networks. Feature selection approaches are designed to address the challenges of high-dimensional data, with the aim of making learning more efficient and reducing model complexity. Because labeling such datasets is difficult, a variety of approaches perform feature selection in an unsupervised setting by exploiting important characteristics of the data. In this paper, we introduce a novel unsupervised feature selection approach that applies dictionary learning ideas in a low-rank representation. Dictionary learning in a low-rank representation not only provides a new representation of the data but also maintains feature correlation. Spectral analysis is then employed to preserve sample similarities. Finally, a unified objective function for unsupervised feature selection is proposed, with sparsity imposed by an $\ell_{2,1}$-norm regularization. Furthermore, an efficient numerical algorithm is designed to solve the corresponding optimization problem. We demonstrate the performance of the proposed method on a variety of standard datasets from different application domains. Our experimental findings reveal that the proposed method outperforms state-of-the-art algorithms.
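To make the role of the $\ell_{2,1}$-norm concrete, the following is a minimal sketch, not the algorithm proposed in the paper. It assumes a hypothetical weight matrix `W` with one row per feature, computes $\|W\|_{2,1}$ as the sum of the Euclidean norms of its rows, and ranks features by row norm, which is the usual way row-sparse solutions are turned into a feature selection. The function names and the toy matrix are illustrative assumptions.

```python
import math


def row_norms(W):
    """Return the Euclidean norm of each row of W (a list of equal-length lists)."""
    return [math.sqrt(sum(x * x for x in row)) for row in W]


def l21_norm(W):
    """l2,1 norm: sum over rows of the row-wise Euclidean norms."""
    return sum(row_norms(W))


def rank_features(W, k):
    """Return the indices of the k rows with the largest Euclidean norm.

    Assumption: row i of W holds the weights attached to feature i,
    so a row close to zero marks a feature that can be discarded.
    """
    norms = row_norms(W)
    order = sorted(range(len(norms)), key=lambda i: norms[i], reverse=True)
    return order[:k]


# Hypothetical 3-feature weight matrix (illustrative values only).
W = [
    [3.0, 4.0],  # row norm 5
    [0.0, 0.0],  # row norm 0: this feature would be dropped
    [0.0, 1.0],  # row norm 1
]

print(l21_norm(W))          # sum of the row norms
print(rank_features(W, 2))  # indices of the two strongest features
```

Penalizing this norm pushes entire rows toward zero at once, which is why it selects or discards whole features rather than individual entries.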