Universal style transfer (UST) infuses styles from arbitrary reference images into content images. Existing methods, while enjoying many practical successes, cannot explain certain experimental observations, including why UST algorithms differ in how well they preserve the spatial structure of content images. In addition, existing methods offer only cumbersome global control over stylization, so they require additional spatial masks to achieve a desired stylization. In this work, we provide a systematic Fourier analysis of a general framework for UST. We present an equivalent form of the framework in the frequency domain. This form implies that existing algorithms treat all frequency components and pixels of feature maps equally, except for the zero-frequency component. We connect Fourier amplitude with Gram matrices and Fourier phase with a content reconstruction loss in style transfer. Based on this equivalence and these connections, we use Fourier phase to interpret the different structure-preservation behaviors of the algorithms. Building on these interpretations, we propose two practical manipulations for structure preservation and desired stylization. Both qualitative and quantitative experiments demonstrate that our method is competitive with state-of-the-art methods. We also conduct experiments to demonstrate (1) the aforementioned equivalence, (2) the interpretability based on Fourier amplitude and phase, and (3) the controllability associated with frequency components.
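The frequency-domain statements above can be checked numerically on toy data. The following is a minimal NumPy sketch, not the implementation used in this work: random arrays stand in for encoder feature maps, and the helper names (`amplitude_phase`, `recombine`, `gram`) are hypothetical. It illustrates three standard facts: subtracting the channel mean changes only the zero-frequency component, the diagonal of the Gram matrix depends only on Fourier amplitude (Parseval's theorem), and the amplitude of one feature map can be recombined with the phase of another. The phase swap at the end is one plausible reading of a phase-based manipulation and is an assumption here.

```python
import numpy as np


def amplitude_phase(feat):
    """Split a real (C, H, W) feature map into per-channel Fourier amplitude and phase."""
    spec = np.fft.fft2(feat, axes=(-2, -1))
    return np.abs(spec), np.angle(spec)


def recombine(amplitude, phase):
    """Rebuild a real (C, H, W) feature map from amplitude and phase."""
    spec = amplitude * np.exp(1j * phase)
    # The imaginary part is only floating-point residue when the inputs come from real maps.
    return np.fft.ifft2(spec, axes=(-2, -1)).real


def gram(feat):
    """Unnormalized Gram matrix (C x C) of a (C, H, W) feature map."""
    flat = feat.reshape(feat.shape[0], -1)
    return flat @ flat.T


# Toy stand-ins for encoder features (assumption: real features would come from a network).
rng = np.random.default_rng(0)
C, H, W = 4, 8, 8
content = rng.standard_normal((C, H, W))
stylized = rng.standard_normal((C, H, W))

# 1) Mean subtraction touches only the zero-frequency (DC) component.
spec = np.fft.fft2(content, axes=(-2, -1))
centered = content - content.mean(axis=(-2, -1), keepdims=True)
spec_centered = np.fft.fft2(centered, axes=(-2, -1))
change = np.abs(spec - spec_centered)
change[:, 0, 0] = 0.0  # ignore the DC bin
off_dc_change = change.max()  # largest change at any non-zero frequency

# 2) Gram diagonal from amplitude alone (Parseval's theorem).
amp_s, _ = amplitude_phase(stylized)
gram_diag_from_amp = (amp_s ** 2).sum(axis=(-2, -1)) / (H * W)

# 3) Phase swap: keep the stylized amplitude, take the content phase.
_, pha_c = amplitude_phase(content)
mixed = recombine(amp_s, pha_c)
amp_mixed, _ = amplitude_phase(mixed)

print("max non-DC change after centering:", off_dc_change)
print("Gram diagonal matches amplitude energy:",
      np.allclose(gram_diag_from_amp, np.diag(gram(stylized))))
print("amplitude preserved after phase swap:", np.allclose(amp_mixed, amp_s))
```

In this sketch the phase swap leaves the per-channel amplitude, and hence the Gram diagonal, of the stylized features unchanged while importing the phase of the content features. The off-diagonal Gram entries also depend on phase differences between channels, so they are not guaranteed to be preserved.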