Unsupervised feature selection has attracted increasing attention due to the emergence of massive unlabeled data. To improve the robustness of such methods, both the distribution of samples and the latent effect of training a learning method on samples in a more effective order need to be considered. Self-paced learning is an effective paradigm that takes the training order of samples into account. In this study, an unsupervised feature selection method is proposed by integrating the frameworks of self-paced learning and subspace learning. Moreover, the local manifold structure is preserved and the redundancy of features is constrained by two regularization terms. The $L_{2,1/2}$-norm is applied to the projection matrix, which aims to retain discriminative features and further alleviate the effect of noise in the data. An iterative method is then presented to solve the resulting optimization problem, and its convergence is proved both theoretically and experimentally. The proposed method is compared with state-of-the-art algorithms on nine real-world datasets. The experimental results show that the proposed method improves the performance of clustering methods and outperforms the compared algorithms.
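As a minimal illustration of two of the ingredients named above, the sketch below combines a standard hard-threshold self-paced weighting rule with a row-wise $L_{2,1/2}$ penalty on a projection matrix. The variable names (`W`, `losses`, `lam`), the reconstruction loss, and the exact form of the quasi-norm are assumptions made for illustration only, not the paper's exact objective.

```python
import numpy as np

def self_paced_weights(losses, lam):
    """Hard-threshold self-paced weighting: samples whose current loss is
    below the age parameter `lam` are included (weight 1), the rest are
    excluded (weight 0). As `lam` grows, harder samples enter training."""
    return (losses < lam).astype(float)

def l2_half_penalty(W):
    """Row-sparsity penalty based on the L_{2,1/2} quasi-norm: the sum over
    rows of the square root of each row's l2 norm. (Some formulations also
    square this sum; the exact convention here is an assumption.)"""
    row_norms = np.linalg.norm(W, axis=1)
    return np.sum(np.sqrt(row_norms))

# Toy usage: per-sample subspace reconstruction losses and a random projection.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))        # 100 samples, 20 features
W = rng.normal(size=(20, 5))          # projection onto a 5-dimensional subspace
losses = np.linalg.norm(X - X @ W @ np.linalg.pinv(W), axis=1) ** 2
v = self_paced_weights(losses, lam=np.median(losses))  # start with the "easy" half
penalty = l2_half_penalty(W)          # encourages row sparsity, i.e. feature selection
```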