We present the first systematic study of concealed object detection (COD), which aims to identify objects that are "perfectly" embedded in their background. The high intrinsic similarity between concealed objects and their background makes COD far more challenging than traditional object detection/segmentation. To better understand this task, we collect a large-scale dataset, called COD10K, which consists of 10,000 images covering concealed objects in diverse real-world scenarios from 78 object categories. Further, we provide rich annotations including object categories, object boundaries, challenging attributes, object-level labels, and instance-level annotations. Our COD10K is the largest COD dataset to date, with the richest annotations, which enables comprehensive concealed object understanding and can even be used to help progress several other vision tasks, such as detection, segmentation, and classification. Motivated by how animals hunt in the wild, we also design a simple but strong baseline for COD, termed the Search Identification Network (SINet). Without any bells and whistles, SINet outperforms 12 cutting-edge baselines on all datasets tested, making it a robust, general architecture that could serve as a catalyst for future research in COD. Finally, we provide some interesting findings and highlight several potential applications and future directions. To spark research in this new field, our code, dataset, and online demo are available on our project page: http://mmcheng.net/cod.