Modeling and re-rendering dynamic 3D scenes is a challenging task in 3D vision. Prior approaches build on NeRF and rely on implicit representations. Such approaches are slow because they require many MLP evaluations, which constrains real-world applications. We show that dynamic 3D scenes can be explicitly represented by six planes of learned features, leading to an elegant solution we call HexPlane. A HexPlane computes features for points in spacetime by fusing vectors extracted from each plane, which is highly efficient. Pairing a HexPlane with a tiny MLP to regress output colors and training via volume rendering gives impressive results for novel view synthesis on dynamic scenes, matching the image quality of prior work while reducing training time by more than $100\times$. Extensive ablations confirm our HexPlane design and show that it is robust to different feature fusion mechanisms, coordinate systems, and decoding mechanisms. HexPlane is a simple and effective solution for representing 4D volumes, and we hope it can broadly contribute to modeling spacetime for dynamic 3D scenes.
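To make the feature lookup concrete, the following is a minimal NumPy sketch of how a point in spacetime could be mapped to a feature vector using six planes and then decoded by a tiny MLP. It is an illustration under stated assumptions, not the authors' implementation: the pairing of planes (XY with ZT, XZ with YT, YZ with XT), the fusion rule (elementwise product within a pair, concatenation across pairs), the grid resolution, the channel count, and all function names (`make_hexplane`, `bilinear_sample`, `hexplane_features`, `tiny_mlp`) are hypothetical choices. The abstract itself notes that the design is robust to different fusion mechanisms, so other rules would fit equally well here.

```python
import numpy as np

# Index of each coordinate inside a point (x, y, z, t).
AXES = {"x": 0, "y": 1, "z": 2, "t": 3}

# Assumed pairing: each purely spatial plane is matched with the
# spatiotemporal plane that covers the two remaining axes.
PLANE_PAIRS = [("xy", "zt"), ("xz", "yt"), ("yz", "xt")]


def make_hexplane(resolution=8, channels=4, seed=0):
    """Create six randomly initialized feature planes of shape (R, R, C)."""
    rng = np.random.default_rng(seed)
    names = [name for pair in PLANE_PAIRS for name in pair]
    return {n: rng.normal(size=(resolution, resolution, channels)) for n in names}


def bilinear_sample(plane, u, v):
    """Bilinearly interpolate a (H, W, C) plane at normalized coords u, v in [0, 1]."""
    H, W, _ = plane.shape
    # Map normalized coordinates to continuous grid indices.
    i = float(np.clip(u, 0.0, 1.0)) * (H - 1)
    j = float(np.clip(v, 0.0, 1.0)) * (W - 1)
    i0, j0 = int(np.floor(i)), int(np.floor(j))
    i1, j1 = min(i0 + 1, H - 1), min(j0 + 1, W - 1)
    wi, wj = i - i0, j - j0
    # Interpolate along the second axis, then along the first.
    row0 = (1 - wj) * plane[i0, j0] + wj * plane[i0, j1]
    row1 = (1 - wj) * plane[i1, j0] + wj * plane[i1, j1]
    return (1 - wi) * row0 + wi * row1


def hexplane_features(planes, point):
    """Fuse the six per-plane vectors for one point (x, y, z, t) in [0, 1]^4."""
    fused = []
    for first, second in PLANE_PAIRS:
        a = bilinear_sample(planes[first], point[AXES[first[0]]], point[AXES[first[1]]])
        b = bilinear_sample(planes[second], point[AXES[second[0]]], point[AXES[second[1]]])
        fused.append(a * b)  # assumed fusion: multiply within a pair
    return np.concatenate(fused)  # assumed fusion: concatenate across pairs


def tiny_mlp(features, w1, w2):
    """Decode a feature vector to an RGB color in (0, 1) with one hidden layer."""
    hidden = np.maximum(features @ w1, 0.0)  # ReLU
    return 1.0 / (1.0 + np.exp(-(hidden @ w2)))  # sigmoid


planes = make_hexplane()
feat = hexplane_features(planes, (0.3, 0.7, 0.5, 0.1))
rng = np.random.default_rng(1)
color = tiny_mlp(feat, rng.normal(size=(feat.size, 16)), rng.normal(size=(16, 3)))
print(feat.shape, color.shape)  # (12,) (3,)
```

Each query touches only six small 2D lookups and one shallow decoder, which is the source of the efficiency the abstract describes. In a full pipeline, the planes and MLP weights would be optimized through volume rendering; that training loop is omitted here.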