In recent years, image-to-image (I2I) translation methods have been proposed to translate a given image into diverse outputs. Despite their impressive results, these methods mainly focus on I2I translation between two domains, so multi-domain I2I translation remains a challenge. To address this problem, we propose a novel multi-domain unsupervised image-to-image translation (MDUIT) framework that leverages decomposed content features and appearance-adaptive convolution to translate an image into a target appearance while preserving the given geometric content. We also exploit a contrastive learning objective, which improves disentanglement and effectively utilizes multi-domain image data during training by pairing semantically similar images. This allows our method to learn diverse mappings between multiple visual domains within a single framework. We show that, compared to state-of-the-art methods, the proposed approach produces visually diverse and plausible results across multiple domains.
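To make the contrastive learning objective mentioned above more concrete, the following is a minimal sketch of an InfoNCE-style loss over content features of semantically paired images. The function name, temperature value, and in-batch negative scheme are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch (assumption, not the paper's code): an InfoNCE-style
# contrastive loss where content features of semantically similar images
# form positive pairs and the other samples in the batch act as negatives.
import torch
import torch.nn.functional as F

def contrastive_content_loss(content_a, content_b, temperature=0.07):
    """content_a, content_b: (N, D) content features of semantically paired images."""
    a = F.normalize(content_a, dim=1)
    b = F.normalize(content_b, dim=1)
    logits = a @ b.t() / temperature                    # (N, N) scaled cosine similarities
    targets = torch.arange(a.size(0), device=a.device)  # diagonal entries are the positives
    return F.cross_entropy(logits, targets)

# Hypothetical usage with a content encoder E_c and two semantically paired batches:
# loss = contrastive_content_loss(E_c(x), E_c(x_pos))
```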