[PaoPao One-Minute Read] VegFru: A Domain-Specific Dataset for Fine-grained Visual Categorization (ICCV2017-53)


July 18, 2018 · 泡泡机器人SLAM (PaoPao Robot SLAM)

One minute a day to take you through papers from top robotics conferences

Title: VegFru: A Domain-Specific Dataset for Fine-grained Visual Categorization

Authors: Saihui Hou, Yushan Feng, Zilei Wang

Source: International Conference on Computer Vision (ICCV 2017)

Narrator: 郭晨

Compiled by: 杨雨生 (57)

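Individuals are welcome to share this post on WeChat Moments; other organizations or media accounts that wish to republish it should leave a message with the account to request authorization.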

Summary

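In this paper, the authors propose VegFru, a domain-specific dataset for fine-grained visual categorization (FGVC). Existing FGVC datasets mainly cover animal breeds or man-made objects and offer limited labelled data, whereas VegFru is a larger dataset of vegetables and fruits, which are closely tied to everyday life. Targeting home cooking and food management, VegFru categorizes vegetables and fruits by their eating characteristics, and each image contains at least one edible part of a vegetable or fruit. Notably, every image in the dataset is labelled hierarchically.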
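To make the idea of hierarchical labels concrete, here is a minimal sketch of how a fine-to-coarse label mapping might be represented. The class names and the helper functions `hierarchical_label` and `images_per_coarse` are hypothetical placeholders for illustration; they are not taken from VegFru or its released code.

```python
from collections import Counter

# Hypothetical mapping from subordinate (fine) class to upper-level (coarse)
# category. The names are invented placeholders, not real VegFru classes.
FINE_TO_COARSE = {
    "fine_class_a": "coarse_group_1",
    "fine_class_b": "coarse_group_1",
    "fine_class_c": "coarse_group_2",
}


def hierarchical_label(fine_label):
    """Return the (coarse, fine) label pair for one image."""
    return FINE_TO_COARSE[fine_label], fine_label


def images_per_coarse(fine_labels):
    """Count images per upper-level category given their fine labels."""
    return Counter(FINE_TO_COARSE[f] for f in fine_labels)


# Usage: every image carries a fine label; the coarse label follows from it.
labels = ["fine_class_a", "fine_class_b", "fine_class_c"]
pairs = [hierarchical_label(f) for f in labels]
counts = images_per_coarse(labels)
```

Storing only the fine label per image and deriving the coarse one keeps the two levels consistent by construction.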

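The current version divides vegetables and fruits into 25 upper-level categories and 292 subordinate classes. The dataset contains more than 160,000 images in total, with at least 200 images per subordinate class. Alongside the dataset, the authors propose a framework called HybridNet that exploits the label hierarchy for FGVC. Specifically, the hierarchical labels are first handled separately to extract features at multiple granularities, and these features are then fused through an explicit operation, such as Compact Bilinear Pooling, to form a unified representation for the final recognition. The authors evaluate HybridNet on VegFru, FGVC-Aircraft, and CUB-200-2011, and report that it achieves one of the top performances on these datasets.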
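As an illustration of the fusion step, here is a minimal sketch of Compact Bilinear Pooling using the Tensor Sketch construction, which the paper's abstract names as an example of the explicit fusion operation. It is not the authors' HybridNet code: the function names, the feature sizes, and the pure-Python circular convolution (used here in place of the usual FFT) are assumptions chosen to keep the example self-contained.

```python
import random


def make_hash(n, d, rng):
    """Draw a random bucket in [0, d) and a random sign for each of n inputs."""
    buckets = [rng.randrange(d) for _ in range(n)]
    signs = [rng.choice((-1.0, 1.0)) for _ in range(n)]
    return buckets, signs


def count_sketch(x, buckets, signs, d):
    """Project x into d dimensions by adding signed entries into hashed buckets."""
    out = [0.0] * d
    for value, b, s in zip(x, buckets, signs):
        out[b] += s * value
    return out


def circular_convolve(a, b):
    """Circular convolution of two equal-length vectors (O(d^2), no FFT)."""
    d = len(a)
    out = [0.0] * d
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[(i + j) % d] += ai * bj
    return out


def compact_bilinear_fuse(coarse, fine, hash_c, hash_f, d):
    """Fuse two feature vectors into one d-dimensional vector (Tensor Sketch)."""
    sketch_c = count_sketch(coarse, hash_c[0], hash_c[1], d)
    sketch_f = count_sketch(fine, hash_f[0], hash_f[1], d)
    return circular_convolve(sketch_c, sketch_f)


# Usage with toy features; the sizes 3, 4, and 16 are arbitrary assumptions.
rng = random.Random(0)
d = 16
coarse_feat = [1.0, -2.0, 0.5]
fine_feat = [0.25, 3.0, -1.0, 2.0]
hash_c = make_hash(len(coarse_feat), d, rng)
hash_f = make_hash(len(fine_feat), d, rng)
fused = compact_bilinear_fuse(coarse_feat, fine_feat, hash_c, hash_f, d)
```

Because the count sketch is linear, the fused vector equals a count sketch of the outer product of the two inputs, which is what lets it stand in for full bilinear pooling at a much smaller dimension.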

The authors have open-sourced both the dataset and the code at https://github.com/ustc-vim/vegfru

(Figure: sample images from the dataset.)

Abstract

In this paper, we propose a novel domain-specific dataset named VegFru for fine-grained visual categorization (FGVC). While the existing datasets for FGVC are mainly focused on animal breeds or man-made objects with limited labelled data, VegFru is a larger dataset consisting of vegetables and fruits which are closely associated with the daily life of everyone. Aiming at domestic cooking and food management, VegFru categorizes vegetables and fruits according to their eating characteristics, and each image contains at least one edible part of vegetables or fruits with the same cooking usage. Particularly, all the images are labelled hierarchically. The current version covers vegetables and fruits of 25 upper-level categories and 292 subordinate classes. And it contains more than 160,000 images in total and at least 200 images for each subordinate class. Accompanying the dataset, we also propose an effective framework called HybridNet to exploit the label hierarchy for FGVC. Specifically, multiple granularity features are first extracted by dealing with the hierarchical labels separately. And then they are fused through explicit operation, e.g., Compact Bilinear Pooling, to form a unified representation for the ultimate recognition. The experimental results on the novel VegFru, the public FGVC-Aircraft and CUB-200-2011 indicate that HybridNet achieves one of the top performance on these datasets. The dataset and code are available at https://github.com/ustc-vim/vegfru.