Unpaired image-to-image translation has broad applications in art, design, and scientific simulation. One early breakthrough was CycleGAN, which enforces one-to-one mappings between two unpaired image domains via generative adversarial networks (GANs) coupled with a cycle-consistency constraint, while more recent works promote one-to-many mappings to increase the diversity of the translated images. Motivated by scientific simulation and the need for one-to-one mappings, this work revisits the classic CycleGAN framework and boosts its performance beyond that of more contemporary models without relaxing the cycle-consistency constraint. To achieve this, we equip the generator with a Vision Transformer (ViT) and employ the necessary training and regularization techniques. Compared to previous best-performing models, our model performs better and retains a strong correlation between the original and translated images. An accompanying ablation study shows that both the gradient penalty and self-supervised pre-training are crucial to the improvement. To promote reproducibility and open science, the source code, hyperparameter configurations, and pre-trained models are available at https://github.com/LS4GAN/uvcgan.
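For reference, the cycle-consistency constraint retained here is the standard one from the original CycleGAN paper: with generators G: X → Y and F: Y → X, round-trip translations are penalized for deviating from their inputs. The notation below follows that paper, not this work's released code:

```latex
\mathcal{L}_{\mathrm{cyc}}(G, F) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\bigl[\lVert F(G(x)) - x \rVert_1\bigr]
+ \mathbb{E}_{y \sim p_{\mathrm{data}}(y)}\bigl[\lVert G(F(y)) - y \rVert_1\bigr]
```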
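To illustrate the gradient-penalty regularizer that the ablation study identifies as crucial, below is a minimal PyTorch sketch in the WGAN-GP style. The exact variant used in this work (interpolation scheme, penalty center, and weight) may differ, so treat this as an assumption and consult the released code for the authoritative implementation:

```python
import torch

def gradient_penalty(disc, real, fake):
    """WGAN-GP-style penalty: push the discriminator's gradient
    norm toward 1 on points interpolated between real and fake."""
    # Random per-sample mixing coefficients (assumes 4D image batches).
    alpha = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    interp = (alpha * real + (1 - alpha) * fake).requires_grad_(True)

    scores = disc(interp)

    # Gradients of the discriminator scores w.r.t. the interpolated inputs.
    grads = torch.autograd.grad(
        outputs=scores,
        inputs=interp,
        grad_outputs=torch.ones_like(scores),
        create_graph=True,
        retain_graph=True,
    )[0]

    grads = grads.view(grads.size(0), -1)
    return ((grads.norm(2, dim=1) - 1) ** 2).mean()
```

In a typical training loop, the returned scalar is scaled by a coefficient λ and added to the discriminator loss.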