Large-scale datasets have played an indispensable role in the recent success of face generation and editing, and have significantly facilitated advances in emerging research fields. However, the academic community still lacks a video dataset with diverse facial attribute annotations, which is crucial for research on face-related videos. In this work, we propose a large-scale, high-quality, and diverse video dataset with rich facial attribute annotations, named the High-Quality Celebrity Video Dataset (CelebV-HQ). CelebV-HQ contains 35,666 video clips with a resolution of at least 512x512, involving 15,653 identities. All clips are manually labeled with 83 facial attributes covering appearance, action, and emotion. We conduct a comprehensive analysis in terms of age, ethnicity, brightness stability, motion smoothness, head pose diversity, and data quality to demonstrate the diversity and temporal coherence of CelebV-HQ. In addition, its versatility and potential are validated on two representative tasks: unconditional video generation and video facial attribute editing. Finally, we discuss the future potential of CelebV-HQ, as well as the new opportunities and challenges it would bring to related research directions. Data, code, and models are publicly available. Project page: https://celebv-hq.github.io.