We present X-Avatar, a novel avatar model that captures the full expressiveness of digital humans to bring about life-like experiences in telepresence, AR/VR and beyond. Our method models bodies, hands, facial expressions and appearance in a holistic fashion and can be learned from either full 3D scans or RGB-D data. To achieve this, we propose a part-aware learned forward skinning module that can be driven by the parameter space of SMPL-X, allowing for expressive animation of X-Avatars. To learn the neural shape and deformation fields efficiently, we propose novel part-aware sampling and initialization strategies. This leads to higher-fidelity results, especially for smaller body parts, while maintaining efficient training despite the increased number of articulated bones. To capture the appearance of the avatar with high-frequency details, we extend the geometry and deformation fields with a texture network that is conditioned on pose, facial expression, geometry and the normals of the deformed surface. We show experimentally that our method outperforms strong baselines on the animation task, both quantitatively and qualitatively, in both data domains. To facilitate future research on expressive avatars, we contribute a new dataset, called X-Humans, containing 233 sequences of high-quality textured scans from 20 participants, totalling 35,500 data frames.
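For readers unfamiliar with forward skinning, the sketch below illustrates the linear blend skinning (LBS) step that such a module builds on: each canonical point is deformed by a weighted blend of per-bone rigid transforms, where in X-Avatar the weights would come from a learned, part-aware weight field and the transforms from SMPL-X pose parameters. This is a minimal illustration, not the authors' implementation; the function and array names are hypothetical.

```python
import numpy as np

def forward_lbs(x_canonical, skinning_weights, bone_transforms):
    """Deform canonical points with linear blend skinning (sketch).

    x_canonical      : (N, 3) points in the canonical pose
    skinning_weights : (N, B) per-point weights over B bones, rows sum
                       to 1 (in X-Avatar these would be predicted by a
                       learned, part-aware skinning-weight field)
    bone_transforms  : (B, 4, 4) rigid bone transforms, e.g. derived
                       from SMPL-X pose parameters
    returns          : (N, 3) deformed points
    """
    n = x_canonical.shape[0]
    # Homogeneous coordinates: (N, 4)
    x_h = np.concatenate([x_canonical, np.ones((n, 1))], axis=1)
    # Blend the bone transforms per point: (N, 4, 4)
    blended = np.einsum("nb,bij->nij", skinning_weights, bone_transforms)
    # Apply each point's blended transform
    x_deformed = np.einsum("nij,nj->ni", blended, x_h)
    return x_deformed[:, :3]

# Example: with identity bone transforms, points are left unchanged.
pts = np.random.rand(100, 3)
w = np.random.rand(100, 2)
w /= w.sum(axis=1, keepdims=True)        # normalize weights per point
T = np.tile(np.eye(4), (2, 1, 1))        # two identity bone transforms
assert np.allclose(forward_lbs(pts, w, T), pts)
```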