Most existing style transfer methods assume that styles can be represented by global statistics (e.g., Gram matrices or covariance matrices), and thus address the problem by forcing the output and style images to have similar global statistics. An alternative assumption is that style consists of local patterns, for which algorithms are designed to swap similar local features between the content and style images. However, both kinds of methods neglect the semantic structure of the content image, which may lead to corrupted content structure in the output. In this paper, we make a new assumption: image features from the same semantic region form a manifold, and an image with multiple semantic regions follows a multi-manifold distribution. Based on this assumption, we formulate style transfer as the problem of aligning two multi-manifold distributions and propose a Manifold Alignment based Style Transfer (MAST) framework. The proposed framework allows semantically similar regions of the output and the style image to share similar style patterns. Moreover, the proposed manifold alignment method is flexible enough to allow user editing or the use of semantic segmentation maps as guidance for style transfer. To make the method applicable to photorealistic style transfer, we propose a new adaptive weight skip connection network structure that preserves content details. Extensive experiments verify the effectiveness of the proposed framework for both artistic and photorealistic style transfer. Code is available at https://github.com/NJUHuoJing/MAST.
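To make the "global statistics" assumption concrete, the following is a minimal sketch of the two statistics named in the abstract, a Gram matrix and a covariance matrix, computed over a toy feature map. It is not taken from the MAST code base: the function names `gram_matrix` and `covariance_matrix`, the normalization by the number of positions, and the feature values are all assumptions made for illustration.

```python
def gram_matrix(features):
    """Gram matrix G = F F^T / N for features given as C rows of N values."""
    n = len(features[0])
    return [[sum(a * b for a, b in zip(fi, fj)) / n for fj in features]
            for fi in features]


def covariance_matrix(features):
    """Channel covariance: the Gram matrix of the mean-centered features."""
    centered = []
    for row in features:
        mean = sum(row) / len(row)
        centered.append([v - mean for v in row])
    return gram_matrix(centered)


# Toy feature map (hypothetical values): 2 channels, 4 spatial positions.
style = [[1.0, 2.0, 3.0, 4.0],
         [2.0, 2.0, 2.0, 2.0]]
# The same features with the spatial positions rearranged.
shuffled = [[4.0, 1.0, 3.0, 2.0],
            [2.0, 2.0, 2.0, 2.0]]

# Both statistics sum over positions, so rearranging positions changes nothing.
print(gram_matrix(style) == gram_matrix(shuffled))
print(covariance_matrix(style) == covariance_matrix(shuffled))
```

Because both statistics sum over all spatial positions, they are unchanged when the positions are rearranged, which is one way to see why matching them alone carries no information about which region of the image a feature came from.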