Recent years have witnessed the tremendous success of generative adversarial networks (GANs) in synthesizing photo-realistic images. A GAN generator learns to compose realistic images and reproduce the real data distribution. In doing so, it spontaneously develops a hierarchical visual feature with multi-level semantics. In this work, we show that such a generative feature, learned from image synthesis, has great potential for solving a wide range of computer vision tasks, including both generative ones and, more importantly, discriminative ones. We first train an encoder by treating the pretrained StyleGAN generator as a learned loss function. The visual features produced by our encoder, termed Generative Hierarchical Features (GH-Feat), align closely with the layer-wise GAN representations and hence describe the input image adequately from the reconstruction perspective. Extensive experiments support the versatile transferability of GH-Feat across a range of applications, such as image editing, image processing, image harmonization, face verification, landmark detection, layout prediction, and image retrieval. We further show that, through a proper spatial expansion, GH-Feat can also facilitate fine-grained semantic segmentation using only a few annotations. Both qualitative and quantitative results demonstrate the appealing performance of GH-Feat.
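To make the idea of using a pretrained generator as a learned loss more concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes a tiny fixed linear map in place of StyleGAN and a linear encoder in place of the real network; the names `A`, `generator`, `encoder`, `reconstruction_loss`, and `train_encoder` are hypothetical. Only the encoder weights are updated, while the error signal is back-propagated through the frozen generator.

```python
import random

random.seed(0)

# Hypothetical stand-in for a pretrained generator: a fixed linear map
# from a 2-D code w to a 3-D "image" x = A w. Its weights are never updated.
A = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]


def generator(w):
    """Frozen generator: code -> image."""
    return [sum(a * wi for a, wi in zip(row, w)) for row in A]


def encoder(B, x):
    """Linear encoder with trainable weights B: image -> code."""
    return [sum(b * xi for b, xi in zip(row, x)) for row in B]


# Toy "real" images, sampled from the generator's own range.
data = [generator([random.gauss(0, 1), random.gauss(0, 1)]) for _ in range(200)]


def reconstruction_loss(B):
    """Mean squared error between each image and generator(encoder(image))."""
    total = 0.0
    for x in data:
        recon = generator(encoder(B, x))
        total += sum((r - xi) ** 2 for r, xi in zip(recon, x))
    return total / len(data)


def train_encoder(steps=500, lr=0.02):
    """Gradient descent on the encoder only; the generator acts as the loss."""
    B = [[0.0] * 3 for _ in range(2)]
    for _ in range(steps):
        grad = [[0.0] * 3 for _ in range(2)]
        for x in data:
            recon = generator(encoder(B, x))
            residual = [r - xi for r, xi in zip(recon, x)]
            # Back-propagate the residual through the frozen generator (A^T r).
            back = [sum(A[i][k] * residual[i] for i in range(3)) for k in range(2)]
            for k in range(2):
                for j in range(3):
                    grad[k][j] += 2.0 * back[k] * x[j] / len(data)
        for k in range(2):
            for j in range(3):
                B[k][j] -= lr * grad[k][j]
    return B


untrained = [[0.0] * 3 for _ in range(2)]
trained = train_encoder()
print("loss before training:", reconstruction_loss(untrained))
print("loss after training: ", reconstruction_loss(trained))
```

In this toy setting the reconstruction loss should fall sharply after training, since every sample lies in the range of the frozen generator. A real setup would presumably replace the linear maps with deep networks and add further loss terms, which the abstract does not specify.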