When developing perception-based deep learning models, the benefit of synthetic data is enormous. However, the performance of networks trained with synthetic data for certain computer vision tasks degrades significantly when tested on real-world data, owing to the domain gap between the two. One popular solution for bridging this gap between synthetic and real-world data is to frame it as a domain adaptation task. In this paper, we propose and evaluate novel ways to improve such approaches. In particular, we build upon the UNIT-GAN method. In standard GAN training for domain translation, the pairing of images from the two domains (viz., real and synthetic) is done randomly. We propose a novel method to efficiently incorporate semantic supervision into this pair selection, which boosts the performance of the model and improves the visual quality of the translated images. We report our empirical findings on Cityscapes \cite{cityscapes} and the challenging synthetic dataset Synscapes. Although the findings are reported on the base UNIT-GAN network, they can easily be extended to any other similar network.
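To give an intuition for semantically supervised pair selection, the following is a minimal, illustrative sketch only: each synthetic image is matched to the real image with the most similar semantic label statistics, rather than to a random one. The histogram-based matching, the \texttt{NUM\_CLASSES} constant, and the helper names are simplified stand-ins introduced here for illustration, not the full approach.

\begin{verbatim}
import numpy as np

NUM_CLASSES = 19  # Cityscapes-style semantic classes (illustrative assumption)

def class_histogram(label_map, num_classes=NUM_CLASSES):
    """Normalized histogram of semantic class frequencies in a label map."""
    hist = np.bincount(label_map.ravel(), minlength=num_classes)[:num_classes]
    return hist / max(hist.sum(), 1)

def semantic_pairing(real_labels, synthetic_labels):
    """For each synthetic label map, return the index of the real image whose
    semantic layout (class histogram) is closest, instead of pairing randomly."""
    real_hists = np.stack([class_histogram(l) for l in real_labels])      # (R, C)
    syn_hists = np.stack([class_histogram(l) for l in synthetic_labels])  # (S, C)
    # L1 distance between class distributions; smaller = more similar content.
    dists = np.abs(syn_hists[:, None, :] - real_hists[None, :, :]).sum(-1)
    return dists.argmin(axis=1)  # best-matching real index per synthetic image

# Toy usage with random "label maps" standing in for real annotations.
rng = np.random.default_rng(0)
real = [rng.integers(0, NUM_CLASSES, size=(64, 64)) for _ in range(8)]
syn = [rng.integers(0, NUM_CLASSES, size=(64, 64)) for _ in range(4)]
print(semantic_pairing(real, syn))
\end{verbatim}

The selected indices would then replace random sampling when forming (real, synthetic) training pairs for a UNIT-GAN-style translator.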