Deep neural networks have become prevalent in human analysis, boosting the performance of applications such as biometric recognition, action recognition, and person re-identification. However, the performance of such networks scales with the amount of available training data. In human analysis, the demand for large-scale datasets poses a severe challenge, as data collection is tedious, time-consuming, and costly, and must comply with data protection laws. Current research investigates the generation of \textit{synthetic data} as an efficient and privacy-preserving alternative to collecting real data in the field. This survey introduces the basic definitions and methodologies essential for generating and employing synthetic data for human analysis. It summarises current state-of-the-art methods and the main benefits of using synthetic data, and provides an overview of publicly available synthetic datasets and generation models. Finally, we discuss limitations and open research problems in this field. This survey is intended for researchers and practitioners in the field of human analysis.