Recently, synthesizing personalized characters from a single user-given portrait has received remarkable attention with the rapid popularization of social media and the metaverse. The input image is not always in frontal view, so acquiring or predicting the canonical view is important for 3D modeling and other applications. Although progress in generative models enables the stylization of a portrait, obtaining the stylized image in the canonical view remains a challenging task. There have been several studies on face frontalization, but their performance decreases significantly when the input is not in the real-image domain, e.g., cartoons or paintings. Stylizing after frontalization also results in degraded outputs. In this paper, we propose a novel and unified framework that generates stylized portraits in the canonical view. With a proposed latent mapper, we analyze and discover a frontalization mapping in the latent space of StyleGAN, stylizing and frontalizing at once. In addition, our model can be trained on unlabelled 2D image sets, without any 3D supervision. The effectiveness of our method is demonstrated by experimental results.
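The abstract does not specify the mapper's architecture; the following is a minimal sketch of the general idea, assuming a StyleGAN2-style W+ latent code of shape (18, 512) and a residual MLP mapper. The names LatentMapper, invert, and G are hypothetical placeholders, not the authors' released code.

```python
import torch
import torch.nn as nn

class LatentMapper(nn.Module):
    """Hypothetical MLP predicting a residual edit in StyleGAN's W+ space,
    intended to shift an inverted portrait code toward a stylized,
    frontal (canonical-view) code in a single step."""
    def __init__(self, w_dim: int = 512):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(w_dim, w_dim), nn.LeakyReLU(0.2),
            nn.Linear(w_dim, w_dim), nn.LeakyReLU(0.2),
            nn.Linear(w_dim, w_dim),
        )

    def forward(self, w_plus: torch.Tensor) -> torch.Tensor:
        # w_plus: (batch, num_layers, w_dim) code obtained by GAN inversion
        # of the input portrait; the residual jointly frontalizes and stylizes.
        return w_plus + self.mlp(w_plus)

# Usage sketch: invert the portrait, map the code, and decode with a
# stylized StyleGAN generator G. `invert` and `G` are assumed components.
# w_plus = invert(input_portrait)              # (1, 18, 512)
# w_canonical = LatentMapper()(w_plus)         # frontalized + stylized code
# output = G.synthesis(w_canonical)            # canonical-view stylized portrait
```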