Arbitrary style transfer (AST) transfers arbitrary artistic styles onto content images. Despite rapid recent progress, existing AST methods are either incapable of running at ultra-resolutions (e.g., 4K) with limited resources or too slow to do so, which severely hinders their practical application. In this paper, we tackle this dilemma by learning a straightforward and lightweight model, dubbed MicroAST. The key insight is to completely abandon the use of cumbersome pre-trained Deep Convolutional Neural Networks (e.g., VGG) at inference. Instead, we design two micro encoders (a content encoder and a style encoder) and one micro decoder for style transfer. The content encoder extracts the main structure of the content image. The style encoder, coupled with a modulator, encodes the style image into learnable dual-modulation signals that modulate both the intermediate features and the convolutional filters of the decoder, thus injecting more sophisticated and flexible style signals to guide the stylizations. In addition, to boost the ability of the style encoder to extract more distinct and representative style signals, we also introduce a new style signal contrastive loss into our model. Compared to the state of the art, our MicroAST not only produces visually superior results but also is 5-73 times smaller and 6-18 times faster, for the first time enabling super-fast (about 0.5 seconds) AST at 4K ultra-resolutions. Code is available at https://github.com/EndyWon/MicroAST.
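The "dual-modulation" idea described above can be sketched minimally as follows. This is an illustrative NumPy sketch, not MicroAST's actual implementation: the function names and shapes are hypothetical, feature modulation is shown as AdaIN-style per-channel scale/shift, and filter modulation is shown as StyleGAN2-style weight scaling with demodulation. The real model learns these signals jointly with the micro encoders and decoder.

```python
import numpy as np

def feature_modulation(feat, gamma, beta):
    # Modulate intermediate decoder features: normalize each channel,
    # then scale/shift with style-derived signals (AdaIN-style).
    # feat: (C, H, W); gamma, beta: (C,)
    mu = feat.mean(axis=(1, 2), keepdims=True)
    sigma = feat.std(axis=(1, 2), keepdims=True) + 1e-5
    return gamma[:, None, None] * (feat - mu) / sigma + beta[:, None, None]

def filter_modulation(weight, style_scale):
    # Modulate convolutional filters: scale each input channel of the
    # kernel by a style-derived factor, then renormalize each output
    # filter to unit norm (StyleGAN2-style demodulation).
    # weight: (C_out, C_in, k, k); style_scale: (C_in,)
    w = weight * style_scale[None, :, None, None]
    demod = np.sqrt((w ** 2).sum(axis=(1, 2, 3), keepdims=True) + 1e-8)
    return w / demod
```

Applying both forms of modulation lets the style image influence not only *what* the decoder's features encode but also *how* its filters respond, which is what the abstract means by injecting "more sophisticated and flexible style signals."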