PartImageNet: A Large, High-Quality Dataset of Parts

Part-based object understanding facilitates efficient compositional learning and knowledge transfer, improves robustness to occlusion, and has the potential to increase performance on general recognition and localization tasks. However, research on part-based models is hindered by the lack of datasets with part annotations, which stems from the extreme difficulty and high cost of annotating object parts in images. In this paper, we propose PartImageNet, a large, high-quality dataset with part segmentation annotations. It consists of 158 classes from ImageNet with approximately 24,000 images. PartImageNet is unique because it offers part-level annotations on a general set of classes with non-rigid, articulated objects, while being an order of magnitude larger than existing datasets. It can be used in multiple vision tasks, including but not limited to part discovery, semantic segmentation, and few-shot learning. Comprehensive experiments are conducted to establish a set of baselines on PartImageNet, and we find that existing work on part discovery cannot always produce satisfactory results under complex variations. The exploitation of parts in downstream tasks also remains insufficient. We believe that PartImageNet will greatly facilitate research on part-based models and their applications. The dataset and scripts will soon be released at https://github.com/TACJu/PartImageNet.
