Image and video synthesis has become a flourishing topic in the computer vision and machine learning communities with the development of deep generative models, owing to its great academic and application value. Because humans are among the most commonly seen object categories in daily life, many researchers have devoted themselves to synthesizing high-fidelity human images, and a large number of studies have been conducted across various deep generative models, task settings, and applications. A comprehensive overview of these varied human image generation methods is therefore needed. In this paper, we divide human image generation techniques into three paradigms: data-driven methods, knowledge-guided methods, and hybrid methods. For each paradigm, we present the most representative models and their variants, and summarize the advantages and characteristics of the different methods in terms of model architecture and input/output requirements. We also summarize the main public human image datasets and evaluation metrics used in the literature. Furthermore, given the wide application potential of synthesized human images, we cover two typical downstream uses: data augmentation for person recognition tasks and virtual try-on for fashion customers. Finally, we discuss the challenges and potential directions of human image generation to shed light on future research.