The disentanglement of the StyleGAN latent space has paved the way for realistic and controllable image editing, but does StyleGAN know anything about temporal motion, given that it was trained only on static images? To study the motion features in the latent space of StyleGAN, we hypothesize and demonstrate that a series of meaningful, natural, and versatile small, local movements (referred to as "micromotions", such as expression, head movement, and aging effect) can be represented in low-rank spaces extracted from the latent space of a conventionally pre-trained StyleGAN-v2 model for face generation, with the guidance of proper "anchors" in the form of either short text or video clips. Starting from one target face image, with the editing direction decoded from the low-rank space, its micromotion features can be represented as simply as an affine transformation of its latent feature. Perhaps more surprisingly, such a micromotion subspace, even when learned from just a single target face, can be painlessly transferred to other unseen face images, even those from vastly different domains (such as oil painting, cartoon, and sculpture faces). This demonstrates that the local feature geometry corresponding to one type of micromotion is aligned across different face subjects, and hence that StyleGAN-v2 is indeed "secretly" aware of the subject-disentangled feature variations caused by that micromotion. We present various successful examples of applying our low-dimensional micromotion subspace technique to directly and effortlessly manipulate faces, showing high robustness, low computational overhead, and impressive domain transferability. Our code is available at https://github.com/wuqiuche/micromotion-StyleGAN.
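The idea of a low-rank micromotion subspace applied as an affine edit can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the low-rank space is obtained by a plain PCA (via SVD) over the latent codes of a few anchor frames, and it uses synthetic 512-dimensional vectors in place of real StyleGAN-v2 latents. The function names `micromotion_directions` and `apply_micromotion` are hypothetical.

```python
import numpy as np


def micromotion_directions(anchor_latents, rank=1):
    """Return the top-`rank` principal directions of a latent trajectory.

    Assumption: PCA via SVD stands in for the low-rank decomposition;
    the paper's actual extraction procedure may differ.
    """
    latents = np.asarray(anchor_latents, dtype=np.float64)
    centered = latents - latents.mean(axis=0, keepdims=True)
    # Rows of vt are orthonormal directions sorted by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:rank]


def apply_micromotion(latent, directions, strengths):
    """Affine edit: shift the latent along the subspace directions."""
    return latent + np.asarray(strengths, dtype=np.float64) @ directions


# Synthetic stand-in for the latent codes of 16 anchor frames of one face:
# a base latent moving along a single hidden direction, plus small noise.
rng = np.random.default_rng(0)
dim = 512
true_dir = rng.normal(size=dim)
true_dir /= np.linalg.norm(true_dir)
base_latent = rng.normal(size=dim)
steps = np.linspace(0.0, 3.0, 16)
frames = base_latent + steps[:, None] * true_dir
frames += 0.001 * rng.normal(size=frames.shape)

direction = micromotion_directions(frames, rank=1)

# The sign of an SVD direction is arbitrary, so compare absolute cosine.
cosine = abs(float(direction[0] @ true_dir))

# Transfer: reuse the same direction on a different, unseen latent code.
other_latent = rng.normal(size=dim)
edited = apply_micromotion(other_latent, direction, strengths=[2.0])
```

In this toy setting the recovered direction aligns closely with the hidden one, and the same direction is reused unchanged on a second latent code, which mirrors the transfer claim only in form. With a real generator, the edited latent would be decoded back into an image.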