Style transfer usually refers to the task of applying color and texture information from a specific style image to a given content image while preserving the structure of the latter. Here we tackle the more generic problem of semantic style transfer: given two unpaired collections of images, we aim to learn a mapping between the corpus-level style of each collection, while preserving semantic content shared across the two domains. We introduce XGAN ("Cross-GAN"), a dual adversarial autoencoder, which captures a shared representation of the common domain semantic content in an unsupervised way, while jointly learning the domain-to-domain image translations in both directions. We exploit ideas from the domain adaptation literature and define a semantic consistency loss which encourages the model to preserve semantics in the learned embedding space. We report promising qualitative results for the task of face-to-cartoon translation. The cartoon dataset we collected for this purpose is in the process of being released as a new benchmark for semantic style transfer.
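The semantic consistency loss described above can be sketched in a few lines: embed an input from domain 1, translate it into domain 2, re-embed the translation with the domain-2 encoder, and penalize the distance between the two embeddings. The toy linear "encoders" and "decoder" below are stand-ins for the paper's neural networks, and the squared-distance penalty is an assumption about the exact distance used; this is a minimal illustration, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the learned networks; in XGAN these are
# convolutional encoders/decoders trained jointly with adversarial losses.
W_enc1 = rng.normal(size=(8, 16))   # encoder for domain 1 (e.g. faces)
W_enc2 = rng.normal(size=(8, 16))   # encoder for domain 2 (e.g. cartoons)
W_dec2 = rng.normal(size=(16, 8))   # decoder producing domain-2 images

def encode1(x):
    return np.tanh(W_enc1 @ x)

def encode2(x):
    return np.tanh(W_enc2 @ x)

def decode2(z):
    return np.tanh(W_dec2 @ z)

def semantic_consistency_loss(x1):
    """Embed x1, translate it to domain 2, re-embed the translation,
    and penalize the distance between the two embeddings so that the
    shared representation survives the domain translation."""
    z = encode1(x1)        # shared embedding of the domain-1 input
    x12 = decode2(z)       # translation into domain 2
    z_back = encode2(x12)  # embedding of the translated image
    return float(np.sum((z - z_back) ** 2))

x = rng.normal(size=16)          # a fake domain-1 "image"
loss = semantic_consistency_loss(x)
```

In training, this term would be minimized alongside the reconstruction and adversarial objectives in both translation directions, pulling the two domain encoders toward a common embedding space.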