Recent face generation methods attempt to synthesize faces from a given contour condition, such as a low-resolution image or a sketch. However, the problem of identity ambiguity remains unsolved; it typically arises when the contour is too vague to provide reliable identity information (e.g., when its resolution is extremely low), so that infinitely many restored images are plausible. In this work, we propose a novel framework that takes as input both the contour and an extra image specifying the identity, where the contour can be of various modalities, including a low-resolution image, a sketch, and a semantic label map. Concretely, we propose a novel dual-encoder architecture, in which an identity encoder extracts identity-related features, while a main encoder captures the rough contour information and fuses all the information together. The encoder output is iteratively fed into a pre-trained StyleGAN generator until a satisfactory result is obtained. To the best of our knowledge, this is the first work to achieve identity-guided face generation conditioned on multi-modal contour images. Moreover, our method produces photo-realistic results at 1024$\times$1024 resolution.
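The dual-encoder pipeline described above can be sketched as follows. This is a minimal illustrative mock-up, not the authors' implementation: `identity_encoder`, `main_encoder`, and `stylegan_generator` are hypothetical stand-ins (the real system uses learned networks and a pre-trained StyleGAN), shown only to make the data flow of the iterative refinement loop concrete.

```python
import numpy as np

def identity_encoder(id_image):
    # Stand-in for the identity encoder: extracts an identity-related
    # feature vector from the reference image (here, per-channel means).
    return id_image.mean(axis=(0, 1))

def main_encoder(current, id_feature):
    # Stand-in for the main encoder: extracts rough contour information
    # and fuses it with the identity feature into one latent code.
    contour_feature = current.mean(axis=(0, 1))
    return np.concatenate([contour_feature, id_feature])

def stylegan_generator(latent, shape):
    # Stand-in for a frozen, pre-trained StyleGAN generator:
    # maps a latent code to an image-shaped array.
    return np.full(shape, np.tanh(latent.mean()))

def refine(contour, id_image, steps=4):
    # Iterative loop from the abstract: the fused encoder output is
    # repeatedly fed into the generator to refine the result.
    id_feature = identity_encoder(id_image)
    result = contour.copy()
    for _ in range(steps):
        latent = main_encoder(result, id_feature)
        result = stylegan_generator(latent, contour.shape)
    return result

rng = np.random.default_rng(0)
contour = rng.random((8, 8, 3))   # toy contour condition (e.g., LR image)
id_image = rng.random((8, 8, 3))  # toy identity reference image
output = refine(contour, id_image)
```

The key design point the sketch preserves is that identity and contour are encoded separately and fused before generation, so the generator receives identity information even when the contour alone is too vague to determine it.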