With the explosive growth in the number of fine-grained images in the Internet era, fast and efficient retrieval from large-scale fine-grained image collections has become a challenging problem. Among the many retrieval methods, hashing methods are widely used because of their high efficiency and low storage cost. Fine-grained hashing is more challenging than conventional hashing because of the characteristics of fine-grained images, namely low inter-class variance and high intra-class variance. To improve the retrieval accuracy of fine-grained hashing, we propose a cascaded network to learn compact and highly semantic hash codes, and introduce an attention-guided data augmentation method. We refer to this network as a cascaded hierarchical data augmentation network. We also propose a novel approach to coordinately balance the losses in multi-task learning. We conduct extensive experiments on several common fine-grained visual classification datasets. The experimental results demonstrate that our proposed method outperforms several state-of-the-art hashing methods and effectively improves the accuracy of fine-grained retrieval. The source code is publicly available at https://github.com/kaiba007/FG-CNET.