Multi-label image classification (MLIC) is a fundamental and practical task that aims to assign multiple labels to an image. In recent years, many approaches based on deep convolutional neural networks (CNNs) have been proposed that model label correlations to discover the semantics of labels and learn semantic representations of images. This paper advances this research direction by improving both the modeling of label correlations and the learning of semantic representations. On the one hand, besides the local semantics of each label, we propose to further explore the global semantics shared by multiple labels. On the other hand, existing approaches mainly learn semantic representations at the last convolutional layer of a CNN, although it has been noted that the image representations at different layers of a CNN capture different levels or scales of features and have different discriminative abilities. We therefore propose to learn semantic representations at multiple convolutional layers. To this end, this paper designs a Multi-layered Semantic Representation Network (MSRN), which discovers both local and global semantics of labels by modeling label correlations and uses the label semantics to guide semantic representation learning at multiple layers through an attention mechanism. Extensive experiments on four benchmark datasets, VOC 2007, COCO, NUS-WIDE, and Apparel, show that the proposed MSRN performs competitively against state-of-the-art models.
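The idea of using label semantics to guide representation learning at several layers through attention can be illustrated with a minimal NumPy sketch. This is not the MSRN implementation: the label embeddings are random placeholders (in the paper they would come from modeling label correlations), the function names `label_guided_attention` and `multi_layer_logits`, the linear projections, the dot-product scoring, and the averaging of per-layer scores are all assumptions made for illustration, and the global semantics shared by multiple labels are not modeled here.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def label_guided_attention(features, label_emb):
    """Pool spatial features into one representation per label.

    features:  (N, D) array, one D-dimensional vector per spatial position.
    label_emb: (C, D) array, one embedding per label (hypothetical input).
    Returns (C, D) label-specific representations and (C, N) attention maps.
    """
    scores = label_emb @ features.T / np.sqrt(features.shape[1])
    attn = softmax(scores, axis=1)  # each label attends over all positions
    return attn @ features, attn


def multi_layer_logits(layer_features, projections, label_emb):
    """Average label scores computed at several convolutional layers.

    Averaging is an assumption of this sketch, not a claim about MSRN.
    """
    per_layer = []
    for feats, proj in zip(layer_features, projections):
        projected = feats @ proj  # map layer channels to the label space
        reps, _ = label_guided_attention(projected, label_emb)
        per_layer.append((reps * label_emb).sum(axis=1))  # (C,) scores
    return np.mean(per_layer, axis=0)


rng = np.random.default_rng(0)
num_labels, dim = 5, 16
label_emb = rng.normal(size=(num_labels, dim))  # placeholder label semantics

# Flattened feature maps from three hypothetical layers (positions x channels).
layer_features = [
    rng.normal(size=(28 * 28, 64)),
    rng.normal(size=(14 * 14, 128)),
    rng.normal(size=(7 * 7, 256)),
]
projections = [
    rng.normal(size=(f.shape[1], dim)) / np.sqrt(f.shape[1])
    for f in layer_features
]

logits = multi_layer_logits(layer_features, projections, label_emb)
probs = 1.0 / (1.0 + np.exp(-logits))  # independent per-label probabilities
print(probs.shape)
```

Each label produces its own attention map over spatial positions at every layer, so shallow layers can contribute fine-scale evidence and deep layers coarse-scale evidence to the same label score. A sigmoid is applied per label because, in the multi-label setting, labels are not mutually exclusive.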