Despite significant advances in image-to-image (I2I) translation with generative adversarial networks (GANs), it remains challenging to translate an image into a set of diverse images in multiple target domains using a single generator-discriminator pair. Existing multimodal I2I translation methods adopt a separate domain-specific content encoder for each domain, and each such encoder is trained only with images from its own domain. We argue, however, that the content (domain-invariant) features should be learned from images across all domains. Consequently, the domain-specific content encoders of existing schemes fail to extract domain-invariant features efficiently. To address this issue, we present SoloGAN, a flexible and general model for efficient multimodal I2I translation among multiple domains with unpaired data. In contrast to existing methods, SoloGAN uses a single projection discriminator with an additional auxiliary classifier and shares the encoder and generator across all domains. As such, the model can be trained with images from all domains, so that the domain-invariant content representation can be efficiently extracted. Qualitative and quantitative results on a wide range of datasets, against several counterparts and variants of SoloGAN, demonstrate the merits of the method, especially for challenging I2I translation tasks, i.e., tasks that involve extreme shape variations or need to keep complex backgrounds unchanged after translation. Furthermore, we demonstrate the contribution of each component using ablation studies.
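The abstract does not spell out the form of the discriminator. As an illustration only, the sketch below follows the commonly used projection formulation, in which a domain-conditional term (an inner product between a learned domain embedding and the image feature) is added to an unconditional real/fake logit, and a separate linear head acts as the auxiliary domain classifier. The class name `ProjectionDiscriminatorHead`, the feature dimension, and the random initialization are all hypothetical. The feature vector stands in for the output of a convolutional backbone that is not shown, and none of this is taken from the SoloGAN implementation.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


class ProjectionDiscriminatorHead:
    """Toy head applied to an already-extracted feature vector phi(x).

    Hypothetical sketch: one set of weights serves every domain, and the
    domain label enters only through a learned embedding.
    """

    def __init__(self, feat_dim, num_domains, seed=0):
        rng = random.Random(seed)

        def rand_vec():
            return [rng.gauss(0.0, 0.1) for _ in range(feat_dim)]

        self.w = rand_vec()  # unconditional real/fake direction
        self.b = 0.0
        self.domain_embed = [rand_vec() for _ in range(num_domains)]
        self.cls_weight = [rand_vec() for _ in range(num_domains)]

    def realness_logit(self, feat, domain):
        # Unconditional logit plus the projection term <embed[domain], phi(x)>.
        return dot(self.w, feat) + self.b + dot(self.domain_embed[domain], feat)

    def domain_probs(self, feat):
        # Auxiliary classifier: softmax over per-domain linear scores.
        logits = [dot(row, feat) for row in self.cls_weight]
        m = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(z - m) for z in logits]
        total = sum(exps)
        return [e / total for e in exps]


head = ProjectionDiscriminatorHead(feat_dim=8, num_domains=3)
feat = [0.5] * 8  # stand-in for a backbone feature
logit_d0 = head.realness_logit(feat, domain=0)
logit_d1 = head.realness_logit(feat, domain=1)
probs = head.domain_probs(feat)
```

Because the domain label only selects an embedding, the same discriminator parameters are updated by images from every domain, which is the property the abstract emphasizes for the shared encoder and generator as well.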