Multiclass classification (MCC) is a fundamental machine learning problem that aims to assign each instance to one of a predefined set of classes. Given an instance, a classification model computes a score for each class, and these scores are then used to sort the classes. The performance of a classification model is usually measured by Top-K Accuracy/Error (e.g., K=1 or 5). In this paper, we do not aim to propose new neural representation learning models, as most recent works do, but to show that it is easy to boost MCC performance with a novel formulation through the lens of ranking. In particular, by viewing MCC as the task of ranking classes for an instance, we first argue that ranking metrics, such as Normalized Discounted Cumulative Gain (NDCG), can be more informative than existing Top-K metrics. We further demonstrate that the dominant neural MCC architecture can be formulated as a neural ranking framework with a specific set of design choices. Based on this generalization, we show that it is straightforward and intuitive to leverage techniques from the rich information retrieval literature to improve MCC performance out of the box. Extensive empirical results on both text and image classification tasks, with diverse datasets and backbone models (e.g., BERT for text and ResNet for image classification), show the value of our proposed framework.
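To illustrate why a ranking metric can be more informative than Top-K, the following minimal sketch compares the two on a single instance. It assumes one correct class per instance, binary relevance, and the common log2 position discount; the abstract does not specify which NDCG formulation is used, and the function names `top_k_accuracy` and `ndcg_single_label` are hypothetical.

```python
import math


def rank_classes(scores):
    """Return class indices sorted by descending score (ties keep index order)."""
    return sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)


def top_k_accuracy(scores, true_class, k):
    """1.0 if the true class is among the k highest-scoring classes, else 0.0."""
    return 1.0 if true_class in rank_classes(scores)[:k] else 0.0


def ndcg_single_label(scores, true_class):
    """NDCG under the assumption of one relevant class with gain 1.

    The ideal DCG is 1 / log2(2) = 1, so NDCG reduces to 1 / log2(1 + rank),
    where rank is the 1-based position of the true class.
    """
    rank = rank_classes(scores).index(true_class) + 1
    return 1.0 / math.log2(1 + rank)


# Two hypothetical models scoring the same 6-class instance.
scores_a = [0.10, 0.50, 0.30, 0.05, 0.03, 0.02]  # true class 2 is ranked 2nd
scores_b = [0.30, 0.25, 0.20, 0.15, 0.07, 0.03]  # true class 4 is ranked 5th

for name, scores, label in [("A", scores_a, 2), ("B", scores_b, 4)]:
    print(
        name,
        top_k_accuracy(scores, label, 1),
        top_k_accuracy(scores, label, 5),
        round(ndcg_single_label(scores, label), 4),
    )
```

In this sketch both models score 0 on Top-1 and 1 on Top-5, so the Top-K metrics cannot tell them apart, while NDCG rewards model A for placing the correct class higher in the ranking.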