We explore and analyze the latent style space of StyleGAN2, a state-of-the-art architecture for image generation, using models pretrained on several different datasets. We first show that StyleSpace, the space of channel-wise style parameters, is significantly more disentangled than the other intermediate latent spaces explored by previous works. Next, we describe a method for discovering a large collection of style channels, each of which is shown to control a distinct visual attribute in a highly localized and disentangled manner. Third, we propose a simple method for identifying style channels that control a specific attribute, using a pretrained classifier or a small number of example images. Manipulation of visual attributes via these StyleSpace controls is shown to be better disentangled than via those proposed in previous works. To show this, we make use of a newly proposed Attribute Dependency metric. Finally, we demonstrate the applicability of StyleSpace controls to the manipulation of real images. Our findings pave the way to semantically meaningful and well-disentangled image manipulations via simple and intuitive interfaces.
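As a schematic illustration (not the authors' code), editing a single StyleSpace channel amounts to shifting one scalar in one layer's channel-wise style vector before synthesis. The sketch below uses NumPy with a hypothetical list-of-vectors layout for the per-layer styles; the function and argument names are illustrative assumptions, not the paper's API:

```python
import numpy as np

def edit_style_channel(style_vectors, layer, channel, alpha, sigma):
    """Shift one channel of one layer's style vector by alpha standard deviations.

    style_vectors: list of 1-D arrays, one per generator layer (hypothetical layout).
    sigma: the channel's standard deviation, estimated over many sampled styles.
    Returns an edited copy; the originals are left untouched.
    """
    edited = [s.copy() for s in style_vectors]
    edited[layer][channel] += alpha * sigma
    return edited

# Toy example: two "layers" with four channels each, all zeros.
styles = [np.zeros(4), np.zeros(4)]
edited = edit_style_channel(styles, layer=1, channel=2, alpha=3.0, sigma=0.5)
# Only the targeted entry changes: edited[1][2] == 1.5, everything else stays 0.
```

Because each channel is claimed to control a distinct, localized attribute, a single-scalar shift like this is the entire manipulation; no optimization or direction search is required once the channel is identified.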