This work aims to use a general deep learning framework to synthesize free-viewpoint images of arbitrary human performers, requiring only a sparse set of camera views as input and avoiding per-case fine-tuning. The large variation in geometry and appearance, caused by articulated body poses, shapes and clothing types, is the key bottleneck of this task. To overcome these challenges, we present a simple yet powerful framework, named Generalizable Neural Performer (GNR), that learns a generalizable and robust neural body representation over diverse geometry and appearance. Specifically, we compress the light fields for novel-view human rendering into conditional implicit neural radiance fields from both geometry and appearance aspects. We first introduce an Implicit Geometric Body Embedding strategy that enhances robustness using both a parametric 3D human body model and multi-view image hints. We further propose a Screen-Space Occlusion-Aware Appearance Blending technique that preserves high-quality appearance by interpolating source-view appearance into the radiance fields under relaxed but approximate geometric guidance. To evaluate our method, we present our ongoing effort to construct a dataset of remarkable complexity and diversity. The dataset, GeneBody-1.0, includes over 360M frames of 370 subjects captured by multi-view cameras, performing a large variety of pose actions, with diverse body shapes, clothing, accessories and hairdos. Experiments on GeneBody-1.0 and ZJU-Mocap show that our method is more robust than recent state-of-the-art generalizable methods across the cross-dataset, unseen-subject and unseen-pose settings. We also demonstrate the competitiveness of our model compared with cutting-edge case-specific ones. The dataset, code and model will be made publicly available.