We present Unified Contrastive Arbitrary Style Transfer (UCAST), a novel framework for style representation learning and transfer that can be integrated into most existing arbitrary image style transfer models, e.g., CNN-based, ViT-based, and flow-based methods. As the key component of image style transfer, a suitable style representation is essential for achieving satisfactory results. Existing approaches based on deep neural networks typically rely on second-order statistics to generate the output. However, these hand-crafted features, computed from a single image, cannot exploit style information sufficiently, leading to artifacts such as local distortions and style inconsistency. To address these issues, we propose to learn style representation directly from a large collection of images via contrastive learning, taking into account the relationships between specific styles and the holistic style distribution. Specifically, we present an adaptive contrastive learning scheme for style transfer by introducing an input-dependent temperature. Our framework consists of three key components: a parallel contrastive learning scheme for style representation and style transfer, a domain enhancement module for effective learning of the style distribution, and a generative network for style transfer. Qualitative and quantitative evaluations show that our approach produces results superior to those obtained with state-of-the-art methods.
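To make the adaptive scheme concrete, the sketch below shows one plausible reading of a contrastive style loss with an input-dependent temperature, assuming a standard InfoNCE formulation over style features; it is not the authors' released implementation, and the function and argument names (`adaptive_contrastive_loss`, `temperature`, etc.) are hypothetical.

```python
# Minimal sketch (an assumption, not UCAST's official code) of an
# InfoNCE-style contrastive loss whose temperature is predicted
# per input rather than fixed as a global hyperparameter.
import torch
import torch.nn.functional as F

def adaptive_contrastive_loss(anchor, positive, negatives, temperature):
    """Contrastive loss with an input-dependent temperature.

    anchor:      (B, D)    style features of the stylized outputs
    positive:    (B, D)    style features of the matching style images
    negatives:   (B, N, D) style features drawn from other styles
    temperature: (B, 1)    per-sample temperature tau > 0, e.g. predicted
                           by a small network from the style input
    """
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)

    # Cosine similarities, each scaled by that sample's own temperature.
    pos_sim = (anchor * positive).sum(-1, keepdim=True) / temperature          # (B, 1)
    neg_sim = torch.bmm(negatives, anchor.unsqueeze(-1)).squeeze(-1) / temperature  # (B, N)

    # The positive pair occupies index 0 of each row of logits.
    logits = torch.cat([pos_sim, neg_sim], dim=1)                              # (B, 1+N)
    labels = torch.zeros(anchor.size(0), dtype=torch.long, device=anchor.device)
    return F.cross_entropy(logits, labels)
```

Under this reading, a smaller temperature for a given input sharpens its similarity distribution and penalizes hard negatives more strongly, which is one way an input-dependent temperature can adapt the contrast strength to how distinctive each style is.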