Multiclass classifiers are often designed and evaluated on only a sample of the classes to which they will eventually be applied, so their accuracy in deployment remains unknown. In this work we study how a classifier's performance on the initial class sample can be used to extrapolate its expected accuracy on a larger, unobserved set of classes. To this end, we define a measure of separation between correct and incorrect classes that is independent of the number of classes: the "reversed ROC" (rROC), obtained by exchanging the roles of classes and data points in the standard ROC. We show that classification accuracy is a function of the rROC for multiclass classifiers whose learned representation of data from the initial class sample remains unchanged when new classes are added. Using these results, we formulate a robust neural-network-based algorithm, "CleaneX", which learns to estimate the accuracy of such classifiers on arbitrarily large sets of classes. Unlike previous methods, our method uses both the observed accuracies of the classifier and the densities of its classification scores, and it therefore achieves remarkably better predictions than current state-of-the-art methods on both simulations and real datasets of object detection, face recognition, and brain decoding.
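To make the idea of extrapolating from per-point score separation concrete, the following is a minimal sketch of a naive plug-in estimate. It is not the CleaneX algorithm, which the abstract describes as a learned neural-network estimator. The sketch assumes that classes are drawn independently from a common population and that a data point is classified correctly when its correct-class score exceeds the scores of all incorrect classes; under those assumptions, if a fraction c of incorrect classes scores below the correct class for a given point, that point is classified correctly among k classes with probability c^(k-1). The function names and the score-matrix layout are hypothetical and chosen only for illustration.

```python
import numpy as np


def fraction_below_correct(scores, labels):
    """Per data point: fraction of incorrect classes scoring below the correct class.

    scores: array of shape (n_points, n_classes), higher means more likely (assumed layout).
    labels: integer array of shape (n_points,), index of the correct class.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    n_points, n_classes = scores.shape
    correct = scores[np.arange(n_points), labels]
    # The correct-class column never counts: a score is not strictly below itself.
    below = scores < correct[:, None]
    return below.sum(axis=1) / (n_classes - 1)


def plug_in_accuracy(c, num_classes):
    """Naive plug-in accuracy estimate for num_classes classes: mean of c**(k-1).

    Assumption: the k-1 incorrect classes are sampled independently.
    """
    c = np.asarray(c, dtype=float)
    return float(np.mean(c ** (num_classes - 1)))


# Toy example with three data points scored against three observed classes.
scores = [[0.9, 0.1, 0.2],
          [0.3, 0.5, 0.4],
          [0.2, 0.6, 0.1]]
labels = [0, 1, 0]
c = fraction_below_correct(scores, labels)
for k in (2, 3, 10):
    print(k, plug_in_accuracy(c, k))
```

Because the per-point fractions are themselves estimated from a finite class sample, this plug-in estimate can be inaccurate when extrapolating far beyond the observed number of classes; it is intended only to illustrate why a class-count-independent separation measure determines accuracy at any number of classes.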