Recent studies have shown remarkable success in image-to-image translation between two domains. However, existing approaches have limited scalability and robustness when handling more than two domains, since a separate model must be built independently for every pair of image domains. To address this limitation, we propose StarGAN, a novel and scalable approach that can perform image-to-image translation for multiple domains using only a single model. This unified architecture allows StarGAN to be trained simultaneously on multiple datasets with different domains within a single network, which leads to translated images of superior quality compared to existing models, as well as the novel capability of flexibly translating an input image to any desired target domain. We empirically demonstrate the effectiveness of our approach on facial attribute transfer and facial expression synthesis tasks.
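The abstract does not detail the architecture, but the core idea of a single generator conditioned on a target-domain label can be sketched as follows. This is a minimal illustration in PyTorch, not the paper's actual network: the class name `ConditionalGenerator`, the two-layer convolutional stack, and all layer sizes are assumptions chosen only to show how one network can serve every domain pair.

```python
import torch
import torch.nn as nn


class ConditionalGenerator(nn.Module):
    """Minimal sketch: one generator conditioned on a target-domain label.

    The actual StarGAN generator is a deeper residual network; the layers
    below are illustrative only.
    """

    def __init__(self, num_domains: int, base_channels: int = 64):
        super().__init__()
        # Input channels: 3 RGB channels plus one channel per domain label.
        self.net = nn.Sequential(
            nn.Conv2d(3 + num_domains, base_channels, kernel_size=7, padding=3),
            nn.ReLU(inplace=True),
            nn.Conv2d(base_channels, 3, kernel_size=7, padding=3),
            nn.Tanh(),  # output image in [-1, 1]
        )

    def forward(self, image: torch.Tensor, target_label: torch.Tensor) -> torch.Tensor:
        # image: (N, 3, H, W); target_label: (N, num_domains) one-hot / multi-hot.
        n, _, h, w = image.shape
        # Spatially replicate the label and concatenate it with the image,
        # so the same network can translate to any requested target domain.
        label_map = target_label.view(n, -1, 1, 1).expand(n, target_label.size(1), h, w)
        return self.net(torch.cat([image, label_map], dim=1))


if __name__ == "__main__":
    g = ConditionalGenerator(num_domains=5)
    x = torch.randn(2, 3, 128, 128)            # input images
    c = torch.eye(5)[torch.tensor([0, 3])]     # target-domain one-hot labels
    y = g(x, c)
    print(y.shape)  # torch.Size([2, 3, 128, 128])
```

Conditioning on the label at the input is what replaces the pairwise models: translating to a different domain only requires passing a different label vector, not a different network.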