PatentNet: A Large-Scale Incomplete Multiview, Multimodal, Multilabel Industrial Goods Image Database

In deep learning, large-scale image datasets have driven breakthroughs in object recognition and retrieval. Industrial goods, as an embodiment of innovation, are now significantly more diverse, and their data are incomplete multiview, multimodal, and multilabel, which sets them apart from traditional datasets. In this paper, we introduce PatentNet, an industrial goods dataset with numerous highly diverse, accurate, and detailed annotations of industrial goods images and their corresponding texts. The images and texts in PatentNet are sourced from design patents. With over 6M images and corresponding texts of industrial goods, labeled and manually checked by professionals, PatentNet is the first ongoing industrial goods image database, and it covers a wider variety of goods than the industrial goods datasets previously used for benchmarking. PatentNet organizes millions of images into 32 classes and 219 subclasses based on the Locarno Classification Agreement. Through extensive experiments on image classification, image retrieval, and incomplete multiview clustering, we demonstrate that PatentNet is much more diverse, complex, and challenging than existing industrial image datasets, and offers greater potential. Furthermore, the incomplete multiview, multimodal, and multilabel characteristics of PatentNet offer unparalleled opportunities to the artificial intelligence community and beyond.
