Instance object detection plays an important role in intelligent monitoring, visual navigation, human-computer interaction, intelligent services, and other fields. Inspired by the great success of Deep Convolutional Neural Networks (DCNNs), DCNN-based instance object detection has become a promising research topic. DCNNs, however, require large-scale annotated datasets for supervised training, while manual annotation is exhausting and time-consuming. To address this problem, we propose a new co-training-based framework called Gram Self-Labeling and Detection (Gram-SLD), which can automatically annotate a large amount of data from very limited manually labeled key data while achieving competitive performance. In our framework, a Gram loss is defined and used to construct two fully redundant and independent views, and a key sample selection strategy, together with an automatic annotation strategy that jointly considers precision and recall, is proposed to generate high-quality pseudo-labels. Experiments on the public GMU Kitchen Dataset, the Active Vision Dataset, and the self-made BHID-ITEM Dataset demonstrate that, with only 5% of the training data labeled, Gram-SLD achieves object detection performance competitive with fully supervised methods (less than 2% mAP loss). In practical applications with complex and changing environments, the proposed method satisfies the real-time and accuracy requirements of instance object detection.
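The abstract does not give the exact formulation of the Gram loss; the following is a minimal sketch, assuming the Gram matrix is the channel-wise inner product of a convolutional feature map (as commonly defined in style-transfer work) and a PyTorch implementation. The function names and the use of an MSE distance between the two views' Gram matrices are illustrative assumptions, not the authors' definition.

```python
import torch
import torch.nn.functional as F

def gram_matrix(feat: torch.Tensor) -> torch.Tensor:
    """Channel-wise Gram matrix of a feature map: (B, C, H, W) -> (B, C, C)."""
    b, c, h, w = feat.shape
    f = feat.view(b, c, h * w)
    # Normalize by the number of spatial positions so the scale
    # does not depend on the feature-map resolution.
    return torch.bmm(f, f.transpose(1, 2)) / (h * w)

def gram_loss(feat_a: torch.Tensor, feat_b: torch.Tensor) -> torch.Tensor:
    """Mean-squared distance between the Gram matrices of two feature views.

    Hypothetical form of the paper's Gram loss: it measures how similar the
    second-order channel statistics of the two views are.
    """
    return F.mse_loss(gram_matrix(feat_a), gram_matrix(feat_b))
```

In a co-training setting, such a term could be minimized to align the two views or used as a penalty (e.g., negated) to keep their representations independent, as the abstract's "fully redundant and independent views" suggests; the abstract alone does not specify the sign or weighting, so treat that choice as an assumption.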