Recent face generation methods have tried to synthesize faces conditioned on a given contour, such as a low-resolution image or a sketch. However, the problem of identity ambiguity remains unsolved; it typically arises when the contour is too vague to provide reliable identity information (e.g., when its resolution is extremely low). In this work, we propose a framework that takes as input a contour and an extra image specifying the identity, where the contour can be of various modalities, including a low-resolution image, a sketch, or a semantic label map. This task is especially suited to tracking known criminals or to creative applications for entertainment. Concretely, we propose a novel dual-encoder architecture in which an identity encoder extracts identity-related features, while a main encoder obtains the rough contour information and fuses all the information together. The encoder output is iteratively fed into a pre-trained StyleGAN generator until a satisfactory result is obtained. To the best of our knowledge, this is the first work to achieve identity-guided face generation conditioned on multi-modal contour images. Moreover, our method can produce photo-realistic results at 1024$\times$1024 resolution. Code will be available at https://git.io/Jo4yh.
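The data flow of the dual-encoder design can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`identity_encoder`, `main_encoder`, `frozen_generator`, `generate`), the latent size, the residual-style latent update, the feedback of the previous output into the main encoder, and the fixed step count standing in for "until a satisfactory result is obtained" are all assumptions made for illustration. Images and latents are plain lists of floats, and the pre-trained StyleGAN generator is replaced by a fixed linear map.

```python
LATENT_DIM = 8  # toy size (assumption); a real StyleGAN latent is far larger


def identity_encoder(identity_image):
    """Toy stand-in: summarise the identity image as a small feature vector."""
    mean = sum(identity_image) / len(identity_image)
    return [mean * (i + 1) / LATENT_DIM for i in range(LATENT_DIM)]


def main_encoder(contour, identity_feature, previous_output):
    """Toy stand-in: fuse contour, identity feature and the previous output
    into a latent update (the feedback of the previous output is an assumption)."""
    contour_mean = sum(contour) / len(contour)
    prev_mean = sum(previous_output) / len(previous_output)
    return [0.5 * (contour_mean + f - prev_mean) for f in identity_feature]


def frozen_generator(latent):
    """Toy stand-in for a pre-trained generator whose weights never change."""
    return [0.9 * w for w in latent]


def generate(contour, identity_image, num_steps=5):
    """Run the encoders and the frozen generator for a fixed number of steps.

    A fixed step count is a hypothetical stopping rule chosen for this sketch.
    Returns the list of outputs, one per iteration.
    """
    identity_feature = identity_encoder(identity_image)  # computed once
    latent = [0.0] * LATENT_DIM
    output = frozen_generator(latent)
    history = []
    for _ in range(num_steps):
        delta = main_encoder(contour, identity_feature, output)
        latent = [w + d for w, d in zip(latent, delta)]  # residual update (assumption)
        output = frozen_generator(latent)
        history.append(output)
    return history


if __name__ == "__main__":
    outputs = generate(contour=[0.2, 0.4, 0.6], identity_image=[0.1, 0.9], num_steps=3)
    print(len(outputs), len(outputs[-1]))
```

In this sketch the identity feature is computed once and reused at every step, while only the latent changes between iterations; the generator itself is never updated, mirroring the use of a pre-trained generator described above.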