In this paper, we perform an in-depth study of the properties and applications of aligned generative models. We refer to two models as aligned if they share the same architecture, and one of them (the child) is obtained from the other (the parent) via fine-tuning to another domain, a common practice in transfer learning. Several works already utilize some basic properties of aligned StyleGAN models to perform image-to-image translation. Here, we perform the first detailed exploration of model alignment, also focusing on StyleGAN. First, we empirically analyze aligned models and provide answers to important questions regarding their nature. In particular, we find that the child model's latent spaces are semantically aligned with those of the parent, inheriting incredibly rich semantics, even for distant data domains such as human faces and churches. Second, equipped with this better understanding, we leverage aligned models to solve a diverse set of tasks. In addition to image translation, we demonstrate fully automatic cross-domain image morphing. We further show that zero-shot vision tasks may be performed in the child domain, while relying exclusively on supervision in the parent domain. We demonstrate qualitatively and quantitatively that our approach yields state-of-the-art results, while requiring only simple fine-tuning and inversion.
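As a concrete illustration of the mechanism summarized above, the sketch below shows how a pair of aligned StyleGAN generators can be used for image-to-image translation via inversion. It is a minimal sketch under stated assumptions, not the paper's released implementation: `load_generator`, `invert`, and `real_image` are hypothetical placeholders for checkpoint loading, GAN inversion, and an input image tensor, and a StyleGAN2-style generator with a `.synthesis()` sub-network is assumed.

```python
# Minimal sketch of cross-domain translation with aligned StyleGAN models.
# Assumptions: hypothetical helpers load_generator() and invert(), an input
# tensor real_image, and a StyleGAN2-style generator exposing .synthesis();
# this is not the authors' code, only an illustration of the idea.
import torch

# Parent generator (e.g. trained on human faces) and child generator obtained
# from it by fine-tuning to another domain -- together they form an aligned pair.
G_parent = load_generator("parent_ffhq.pkl")      # hypothetical loader
G_child = load_generator("child_finetuned.pkl")   # hypothetical loader

# Invert a real image into the parent's latent space (optimization- or
# encoder-based; `invert` is a placeholder for either approach).
w = invert(G_parent, real_image)                  # e.g. shape [1, num_ws, 512]

# Because the latent spaces of aligned models are semantically aligned,
# feeding the same latent code to the child generator yields the image
# translated into the child domain.
with torch.no_grad():
    translated = G_child.synthesis(w)
```

The same reuse of latent codes across the aligned pair is what makes the cross-domain morphing and parent-supervised zero-shot tasks described above possible, with fine-tuning and inversion as the only required machinery.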