Weakly-supervised object detection (WSOD) models attempt to leverage image-level annotations in lieu of accurate but costly-to-obtain object localization labels. This often leads to substandard object detection and localization at inference time. To tackle this issue, we propose D2DF2WOD, a Dual-Domain Fully-to-Weakly Supervised Object Detection framework that leverages synthetic data, annotated with precise object localization, to supplement a natural image target domain, where only image-level labels are available. In its warm-up domain adaptation stage, the model learns a fully-supervised object detector (FSOD) to improve the precision of the object proposals in the target domain, and at the same time learns target-domain-specific and detection-aware proposal features. In its main WSOD stage, a WSOD model is specifically tuned to the target domain. The feature extractor and the object proposal generator of the WSOD model are built upon the fine-tuned FSOD model. We test D2DF2WOD on five dual-domain image benchmarks. The results show that our method yields consistently improved object detection and localization compared with state-of-the-art methods.
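The two-stage schedule described above can be sketched as follows. This is a minimal, dependency-free illustration of the control flow only: all class, method, and attribute names here are hypothetical stubs, not the paper's actual implementation, and the real components are deep networks rather than these placeholders.

```python
# Illustrative sketch of the D2DF2WOD two-stage pipeline.
# Stage 1 (warm-up): train a fully-supervised detector (FSOD) on
# synthetic data with precise boxes, adapting its features to the
# target domain. Stage 2 (main): build a weakly-supervised detector
# (WSOD) on top of the fine-tuned FSOD components and train it on
# the target domain using only image-level labels.

class FSOD:
    """Stub for the fully-supervised detector (names are hypothetical)."""

    def __init__(self):
        self.feature_extractor = "backbone"
        self.proposal_generator = "proposal_head"

    def warm_up(self, synthetic_images, box_labels, target_images):
        # Learn from precise box annotations while adapting proposal
        # features toward the (box-unlabeled) target domain.
        self.adapted = True
        return self


class WSOD:
    """Stub for the weakly-supervised detector tuned to the target domain."""

    def __init__(self, fsod):
        # Reuse the fine-tuned FSOD's feature extractor and
        # object proposal generator, as the abstract describes.
        self.feature_extractor = fsod.feature_extractor
        self.proposal_generator = fsod.proposal_generator

    def fit(self, target_images, image_level_labels):
        # Train with image-level (weak) supervision only.
        self.trained = True
        return self


def d2df2wod(synthetic_images, box_labels, target_images, image_labels):
    """Run the warm-up stage, then the main WSOD stage."""
    fsod = FSOD().warm_up(synthetic_images, box_labels, target_images)
    return WSOD(fsod).fit(target_images, image_labels)
```

The key design point the sketch captures is that supervision strength changes between stages (boxes, then tags) while the feature extractor and proposal generator carry over from the warmed-up FSOD into the WSOD model.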