Some of the most exciting experiences that the Metaverse promises, for instance, live interactions with virtual characters in virtual environments, require real-time photo-realistic rendering. 3D reconstruction approaches, whether active or passive, still require extensive cleanup work to fix the resulting meshes or point clouds. In this paper, we present a neural volumography technique, called neural volumetric video or NeuVV, to support immersive, interactive, and spatial-temporal rendering of volumetric video content with photo-realism and in real time. The core of NeuVV is to efficiently encode a dynamic neural radiance field (NeRF) into renderable and editable primitives. We introduce two types of factorization schemes: a hyper-spherical harmonics (HH) decomposition for modeling smooth color variations over space and time, and a learnable basis representation for modeling abrupt density and color changes caused by motion. NeuVV factorization can be integrated into a Video Octree (VOctree), analogous to PlenOctree, to significantly accelerate training while reducing memory overhead. Real-time NeuVV rendering further enables a class of immersive content-editing tools. Specifically, NeuVV treats each VOctree as a primitive and implements volume-based depth ordering and alpha blending to realize spatial-temporal composition for content re-purposing. For example, we demonstrate positioning varied manifestations of the same performance at different 3D locations with different timing, adjusting the color and texture of the performer's clothing, casting spotlight shadows, synthesizing distance-falloff lighting, etc., all at interactive speed. We further develop a hybrid neural-rasterization rendering framework to support consumer-level VR headsets, so that the aforementioned volumetric video viewing and editing can, for the first time, be conducted immersively in virtual 3D space.
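The composition step mentioned above, blending multiple volumetric primitives via depth ordering and alpha blending, follows the standard front-to-back volume-rendering rule. The sketch below is a minimal illustration of that rule, not the paper's implementation: it assumes each primitive contributes `(depth, alpha, rgb)` samples along a camera ray, merges them in depth order, and accumulates color weighted by the remaining transmittance.

```python
# Hedged sketch of front-to-back alpha compositing across samples gathered
# from multiple volumetric primitives (e.g. per-VOctree ray samples).
# The sample format and helper name are illustrative assumptions.

def composite_front_to_back(samples):
    """samples: list of (depth, alpha, rgb) tuples collected from all
    primitives along one camera ray. Returns the blended RGB color."""
    # Merge samples from every primitive into a single depth-sorted list.
    ordered = sorted(samples, key=lambda s: s[0])
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light still unoccluded
    for _depth, alpha, rgb in ordered:
        weight = transmittance * alpha
        for c in range(3):
            color[c] += weight * rgb[c]
        transmittance *= 1.0 - alpha
        if transmittance < 1e-4:  # early ray termination
            break
    return color

# Two primitives interleaved in depth: a semi-transparent red sample
# in front of a fully opaque blue one.
samples = [(2.0, 1.0, (0.0, 0.0, 1.0)),   # opaque blue, farther
           (1.0, 0.5, (1.0, 0.0, 0.0))]   # half-transparent red, nearer
print(composite_front_to_back(samples))   # → [0.5, 0.0, 0.5]
```

Sorting the merged sample list is what realizes depth ordering between primitives: each VOctree can be sampled independently, and correct occlusion falls out of the single global sort before compositing.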