In object detection, the amount of training data trades off against the cost of collecting it, and gathering a large amount of data in a specific domain is labor intensive. Existing large-scale datasets are therefore commonly used for pre-training. However, conventional transfer learning and domain adaptation cannot bridge the domain gap when the target domain differs significantly from the source domain. We propose a data synthesis method that addresses this large domain gap. In this method, a part of a target-domain image is pasted onto a source-domain image, and the position of the pasted region is aligned using the object bounding box annotations. In addition, we introduce adversarial learning to discriminate whether a region is original or pasted. The proposed method is trained on a large number of source images and only a few target-domain images. It achieves higher accuracy than conventional methods in a setting with a very large domain gap, where RGB images serve as the source domain and thermal infrared images as the target domain. It likewise achieves higher accuracy when transferring from simulated images to real images.
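The snippet below is a minimal sketch of the paste-and-align synthesis step described above, written under our own assumptions (the function name, box format, and use of OpenCV are illustrative, not taken from the paper). It crops one object region from a target-domain image, resizes it to an existing source-domain bounding box so the pasted region stays aligned with the annotation, and returns a mask that could serve as the label for the original-vs-pasted discriminator.

```python
import random
import numpy as np
import cv2  # used only for resizing the pasted crop


def paste_target_region(source_img, source_boxes, target_img, target_boxes):
    """Synthesize a mixed-domain image by pasting one target-domain object
    region onto a source-domain image, aligned to a source bounding box.

    source_img, target_img: HxWx3 uint8 arrays.
    source_boxes, target_boxes: lists of (x1, y1, x2, y2) pixel coordinates.
    Returns the synthesized image and a binary mask marking the pasted region.
    """
    out = source_img.copy()
    mask = np.zeros(source_img.shape[:2], dtype=np.uint8)

    # Pick one annotated object box in each domain at random.
    sx1, sy1, sx2, sy2 = random.choice(source_boxes)
    tx1, ty1, tx2, ty2 = random.choice(target_boxes)

    # Crop the target-domain object and resize it to the source box,
    # so the pasted region reuses the existing source annotation.
    crop = target_img[ty1:ty2, tx1:tx2]
    crop = cv2.resize(crop, (sx2 - sx1, sy2 - sy1))

    out[sy1:sy2, sx1:sx2] = crop
    mask[sy1:sy2, sx1:sx2] = 1  # 1 = pasted (target domain), 0 = original
    return out, mask
```

In a training loop, the returned mask would supply the supervision for the adversarial discriminator, while the detector is trained on the synthesized image with the original source bounding boxes.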