With advances in computational power and data collection techniques, deep learning has demonstrated superior performance over most existing algorithms on visual benchmark data sets. Many efforts have been devoted to studying the mechanism of deep learning. One important observation is that deep learning can learn discriminative patterns directly from raw data in a task-dependent manner. As a result, the representations obtained by deep learning significantly outperform hand-crafted features. However, for some real-world applications, such as visual search in online shopping, collecting task-specific labels is too expensive. Coarse-class labels are much more affordable than these scarce task-specific labels, but representations learned from them can be suboptimal for the target task. To mitigate this challenge, we propose an algorithm that learns fine-grained patterns for the target task when only its coarse-class labels are available. More importantly, we provide a theoretical guarantee for this approach. Extensive experiments on real-world data sets demonstrate that the proposed method can significantly improve the performance of learned representations on the target task when only coarse-class information is available for training. Code is available at \url{https://github.com/idstcv/CoIns}.