The advent of deep learning has led to significant progress in monocular human reconstruction. However, existing representations, such as parametric models, voxel grids, meshes, and implicit neural representations, have difficulty achieving high-quality results and real-time speed at the same time. In this paper, we propose the Fourier Occupancy Field (FOF), a novel, powerful, efficient, and flexible 3D representation for real-time and accurate monocular human reconstruction. FOF represents a 3D object with a 2D field orthogonal to the view direction: at each 2D position, the occupancy field of the object along the view direction is compactly represented by the first few terms of its Fourier series, which preserves the topology and neighborhood relations in the 2D domain. A FOF can be stored as a multi-channel image, which is compatible with 2D convolutional neural networks and bridges the gap between 3D geometries and 2D images. FOF is also flexible and extensible; for example, parametric models can be easily integrated into a FOF as a prior to produce more robust results. Based on FOF, we design the first high-fidelity real-time (30+ FPS) monocular human reconstruction framework. We demonstrate the potential of FOF on both a public dataset and real captured data. The code will be released for research purposes.
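To make the representation concrete, the following is a minimal sketch of the expansion described above; the pixel-wise notation $f_{x,y}$, the normalization of the depth coordinate $z$ to $[0,1]$, the truncation order $N$, and the coefficient symbols are illustrative assumptions rather than notation taken from the paper.

\[
f_{x,y}(z) \;\approx\; \frac{a_0(x,y)}{2} + \sum_{n=1}^{N}\Big[a_n(x,y)\cos(2\pi n z) + b_n(x,y)\sin(2\pi n z)\Big], \qquad z \in [0,1],
\]

where $f_{x,y}(z)$ denotes the (binary) occupancy along the viewing ray through 2D position $(x,y)$. Under these assumptions, storing the $2N+1$ coefficient maps $\{a_0, a_1, b_1, \dots, a_N, b_N\}$ per pixel yields the multi-channel image mentioned above, which a 2D convolutional network can regress directly from the input image.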