Dataset complexity assessment aims to predict classification performance on a dataset by computing a complexity measure before any classifier is trained; the same measure can also guide classifier selection and dataset reduction. Training deep convolutional neural networks (DCNNs) is iterative and time-consuming because of hyperparameter uncertainty and the domain shift introduced by different datasets. Hence, it is useful to predict classification performance by effectively assessing dataset complexity before training DCNN models. This paper proposes a novel method called cumulative maximum scaled Area Under Laplacian Spectrum (cmsAULS), which achieves state-of-the-art complexity assessment performance on six datasets.