Manifold learning is used for dimensionality reduction, with the goal of finding a projection subspace that increases the interclass variance while decreasing the intraclass variance. However, a bottleneck for subspace learning methods often arises from the high dimensionality of datasets. In this paper, a hierarchical approach is proposed to scale subspace learning methods, improving classification accuracy on large datasets by 3% to 10%. Different combinations of methods are studied. We assess the proposed method on five publicly available large datasets, using different eigenvalue-based subspace learning methods such as linear discriminant analysis, principal component analysis, generalized discriminant analysis, and reconstruction independent component analysis. To further examine the effect of the proposed method on various classification methods, we feed the resulting projections to linear discriminant analysis, quadratic discriminant analysis, k-nearest neighbor, and random forest classifiers. The resulting classification accuracies are compared to show the effectiveness of the hierarchical approach, which yields an average increase of 5% in classification accuracy.
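To make the pipeline concrete, below is a minimal sketch in Python with scikit-learn of one plausible hierarchical scheme: the feature space is split into blocks, a subspace method (PCA here, one of the eigenvalue-based methods named above) is fitted per block, and a second-level projection is fitted on the concatenated block outputs before classification with k-nearest neighbor. The block-splitting hierarchy, the parameter choices, and the digits dataset are illustrative assumptions, not the paper's exact method or benchmarks.

```python
# Sketch of a two-level hierarchical subspace-learning pipeline.
# ASSUMPTION: the hierarchy splits features into blocks and reduces twice;
# the paper's actual scheme and datasets may differ.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

def hierarchical_projection(X_train, X_test, n_blocks=4, n_components=8):
    """Level 1: fit PCA on each feature block; level 2: fit PCA on the
    concatenated block projections. Returns reduced train/test features."""
    blocks_train = np.array_split(X_train, n_blocks, axis=1)
    blocks_test = np.array_split(X_test, n_blocks, axis=1)
    Z_train, Z_test = [], []
    for b_tr, b_te in zip(blocks_train, blocks_test):
        k = min(n_components, b_tr.shape[1])
        pca = PCA(n_components=k).fit(b_tr)
        Z_train.append(pca.transform(b_tr))
        Z_test.append(pca.transform(b_te))
    Z_train, Z_test = np.hstack(Z_train), np.hstack(Z_test)
    # Second level: subspace learning on the reduced, concatenated features.
    top = PCA(n_components=min(n_components, Z_train.shape[1])).fit(Z_train)
    return top.transform(Z_train), top.transform(Z_test)

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
P_tr, P_te = hierarchical_projection(X_tr, X_te)
clf = KNeighborsClassifier(n_neighbors=5).fit(P_tr, y_tr)
print("accuracy:", accuracy_score(y_te, clf.predict(P_te)))
```

Any of the classifiers named in the abstract (LDA, QDA, random forest) could be swapped in for the k-nearest neighbor step, since each level only exposes a standard fit/transform interface.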