Image-to-image translation is a fundamental task in computer vision. It maps images from a source domain to a target domain so that the outputs take on the target domain's characteristics. Most prior works train a generative model to learn this mapping directly. However, learning such a mapping is challenging because data from different domains can be highly imbalanced in both quality and quantity. To address this problem, we propose a new approach that extracts image features by learning the similarities and differences of samples within the same data distribution, using a novel contrastive learning framework that we call Auto-Contrastive-Encoder (ACE). ACE learns the content code as the similarity between samples that share the same content information but carry different style perturbations. This design enables us to achieve, for the first time, zero-shot image-to-image translation with no training on image translation tasks. Moreover, our learning method effectively captures the style features of images from different domains. Consequently, our model also achieves competitive results on multimodal image translation tasks in the zero-shot setting. Additionally, we demonstrate the potential of our method in transfer learning: with fine-tuning, the quality of translated images improves in unseen domains. Although contrastive learning often relies on large batches, all of our training can be performed on a single GPU with a batch size of 8.
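To make the contrastive idea concrete, the following is a minimal sketch of a generic InfoNCE-style objective, in which two style-perturbed views of the same content form a positive pair and views of other images act as negatives. It is not the ACE loss or architecture, which the abstract does not specify. The names `perturb_style`, `normalize`, and `info_nce` are hypothetical, the "images" are toy pixel lists, and `normalize` is a hand-written stand-in for a learned content encoder.

```python
import math

def normalize(v):
    # Assumption: toy stand-in for a content encoder. Centering and
    # L2-normalizing removes a global gain/bias change (the "style" here).
    mean = sum(v) / len(v)
    centered = [x - mean for x in v]
    norm = math.sqrt(sum(x * x for x in centered)) or 1.0
    return [x / norm for x in centered]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def info_nce(anchors, positives, temperature=0.1):
    # Generic InfoNCE: anchors[i] should match positives[i] and be
    # dissimilar to every other positives[j] in the batch.
    total = 0.0
    for i, a in enumerate(anchors):
        logits = [dot(a, p) / temperature for p in positives]
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_sum - logits[i]
    return total / len(anchors)

def perturb_style(image, gain, bias):
    # Hypothetical style perturbation: a global contrast/brightness change.
    return [gain * p + bias for p in image]

# Three toy "images" with different content.
images = [
    [0.1, 0.9, 0.3, 0.7],
    [0.8, 0.2, 0.6, 0.1],
    [0.5, 0.4, 0.9, 0.2],
]

# Two views per image: same content, different style perturbation.
view_a = [normalize(perturb_style(img, 1.2, 0.05)) for img in images]
view_b = [normalize(perturb_style(img, 0.7, -0.10)) for img in images]

aligned = info_nce(view_a, view_b)                       # correct pairing
shuffled = info_nce(view_a, view_b[1:] + view_b[:1])     # mismatched pairing
print(aligned < shuffled)
```

In this toy setting the loss is lower when each view is paired with the other view of the same content than when the pairs are shuffled, which is the signal a contrastive content encoder would be trained on. A real encoder would be a learned network and the perturbations would be richer than a gain and bias change.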