Unpaired image-to-image translation using Generative Adversarial Networks (GANs) has been successful in converting images among multiple domains. Moreover, recent studies have shown ways to diversify the outputs of the generator. However, since there are no restrictions on how the generator diversifies the results, it is likely to translate some unexpected features. In this paper, we propose Style-Restricted GAN (SRGAN), a novel approach to translate input images into different domains with different styles, changing only the class-related features. Additionally, instead of the KL divergence loss, we adopt three new losses to restrict the distribution of the encoded features: a batch KL divergence loss, a correlation loss, and a histogram imitation loss. The study reports quantitative as well as qualitative results using Precision, Recall, Density, and Coverage. The proposed three losses enhance the level of diversity compared to the conventional KL loss. In particular, SRGAN is found to translate with higher diversity and without changing the class-unrelated features on the CelebA face dataset. Our implementation is available at https://github.com/shinshoji01/Style-Restricted_GAN.
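The abstract does not define the proposed losses; as a rough illustration only, the following NumPy sketch shows one plausible form of a batch-level KL penalty (matching the batch statistics of encoded features to a standard normal) and a correlation penalty (decorrelating encoded dimensions). The function names and exact formulas are our assumptions, not the paper's implementation:

```python
import numpy as np

def batch_kl_loss(z):
    """Hypothetical batch KL penalty: KL(N(mu, var) || N(0, 1)) computed
    from the mean and variance of encoded features z over the batch axis.
    z: array of shape (batch, dim)."""
    mu = z.mean(axis=0)
    var = z.var(axis=0) + 1e-8  # small epsilon for numerical stability
    return 0.5 * np.sum(var + mu**2 - 1.0 - np.log(var))

def correlation_loss(z):
    """Hypothetical correlation penalty: sum of squared off-diagonal
    entries of the correlation matrix between encoded dimensions."""
    c = np.corrcoef(z, rowvar=False)
    off_diagonal = c - np.diag(np.diag(c))
    return np.sum(off_diagonal**2)

# A batch whose features already have zero mean and unit variance
# incurs (near-)zero batch KL penalty.
z = np.array([[1.0, -1.0],
              [-1.0, 1.0]])
print(batch_kl_loss(z))      # near 0
print(correlation_loss(z))   # the two dims are perfectly anti-correlated
```

The histogram imitation loss would additionally require a differentiable histogram approximation, which is omitted here.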