Recently, Generative Adversarial Networks (GANs) have been widely used for portrait image generation. However, in the latent space learned by GANs, different attributes, such as pose, shape, and texture style, are generally entangled, which makes explicit control of specific attributes difficult. To address this issue, we propose SofGAN, an image generator that decouples the latent space of portraits into two subspaces: a geometry space and a texture space. The latent codes sampled from the two subspaces are fed to two network branches separately, one to generate the 3D geometry of portraits in a canonical pose, and the other to generate textures. The aligned 3D geometries also come with semantic part segmentation, encoded as a semantic occupancy field (SOF). The SOF allows the rendering of consistent 2D semantic segmentation maps at arbitrary views, which are then fused with the generated texture maps and stylized into a portrait photo using our semantic instance-wise (SIW) module. Through extensive experiments, we show that our system can generate high-quality portrait images with independently controllable geometry and texture attributes. The method also generalizes well to various applications such as appearance-consistent facial animation and dynamic styling.
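The decoupling described above can be illustrated with a toy sketch. It is not the SofGAN implementation: the neural branches, the 3D occupancy field, view rendering, and the SIW module are all replaced by small hand-written stand-ins, and every name and constant here (`geometry_branch`, `texture_branch`, `fuse`, `NUM_CLASSES`, `SIZE`, `LATENT_DIM`) is hypothetical. It only shows the data flow: one latent code decides the segmentation map, a second decides the per-region appearance, and the two are fused afterwards.

```python
import math
import random

NUM_CLASSES = 3   # hypothetical semantic classes, e.g. background, skin, hair
SIZE = 4          # toy resolution (SIZE x SIZE), an assumption for illustration
LATENT_DIM = 9    # hypothetical latent size; needs 3 values per class below


def sample_latent(seed):
    """Draw a latent code from a standard normal, seeded for reproducibility."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]


def geometry_branch(z_geo):
    """Toy stand-in for the geometry branch: maps a geometry code to a
    SIZE x SIZE map of semantic class labels (no 3D, no view rendering)."""
    seg = []
    for y in range(SIZE):
        row = []
        for x in range(SIZE):
            # One linear score per class; the label is the highest-scoring class.
            scores = [
                z_geo[c] * (x + 1) + z_geo[c + NUM_CLASSES] * (y + 1)
                for c in range(NUM_CLASSES)
            ]
            row.append(scores.index(max(scores)))
        seg.append(row)
    return seg


def texture_branch(z_tex):
    """Toy stand-in for the texture branch: maps a texture code to one
    RGB colour in (0, 1) per semantic class."""
    def squash(v):
        return 1.0 / (1.0 + math.exp(-v))

    return [
        tuple(squash(z_tex[3 * c + k]) for k in range(3))
        for c in range(NUM_CLASSES)
    ]


def fuse(seg, styles):
    """Paint each pixel with the style of its semantic class."""
    return [[styles[label] for label in row] for row in seg]


def generate(z_geo, z_tex):
    """Run both branches independently, then fuse their outputs."""
    seg = geometry_branch(z_geo)
    return seg, fuse(seg, texture_branch(z_tex))
```

In this sketch, holding `z_geo` fixed while resampling `z_tex` leaves the segmentation map unchanged and alters only the colours, which is the kind of independent control the abstract refers to.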