Existing learning-based methods for point cloud rendering adopt various 3D representations and feature-querying mechanisms to alleviate the sparsity of point clouds. However, artifacts still appear in the rendered images because extracting continuous and discriminative 3D features from point clouds remains challenging. In this paper, we present a dense yet lightweight 3D representation, named TriVol, that can be combined with NeRF to render photo-realistic images from point clouds. TriVol consists of three slim volumes, each encoded from the point cloud. This design has two advantages. First, it fuses the respective fields at different scales and thus extracts both local and non-local features for a discriminative representation. Second, since the volume size is greatly reduced, our 3D decoder can be run efficiently, allowing us to increase the resolution of the 3D space to render more point details. Extensive experiments on different benchmarks with varying kinds of scenes and objects demonstrate our framework's effectiveness compared with current approaches. Moreover, our framework generalizes well, rendering an entire category of scenes or objects without fine-tuning.
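The core idea of querying a feature for a 3D point from three axis-aligned slim volumes can be sketched as follows. This is a minimal illustration, not the paper's implementation: the volume shapes, channel count, and nearest-neighbor lookup (in place of trilinear interpolation and the learned encoder/decoder) are all assumptions for demonstration.

```python
import numpy as np

# Hypothetical sizes: each "slim" volume is thin along one axis
# (S << D), so the three volumes together stay lightweight while
# still covering the full 3D space.
C, S, D = 8, 4, 32  # feature channels, slim-axis size, dense-axis size
rng = np.random.default_rng(0)

# Three axis-aligned slim volumes; e.g. V_x is thin along the x axis.
# In TriVol these would be encoded from the point cloud; here they
# are random placeholders.
V_x = rng.standard_normal((C, S, D, D))
V_y = rng.standard_normal((C, D, S, D))
V_z = rng.standard_normal((C, D, D, S))

def sample(volume, p):
    """Nearest-neighbor lookup of a feature at point p in [0, 1]^3."""
    idx = tuple(min(int(c * n), n - 1) for c, n in zip(p, volume.shape[1:]))
    return volume[(slice(None),) + idx]

def trivol_feature(p):
    """Concatenate the features queried from the three slim volumes.

    The fused feature would then feed a NeRF-style MLP to predict
    density and color for volume rendering.
    """
    return np.concatenate([sample(V, p) for V in (V_x, V_y, V_z)])

f = trivol_feature((0.5, 0.25, 0.75))
print(f.shape)  # (24,) = 3 volumes x 8 channels each
```

Because each volume is dense along only two axes, total memory grows roughly as 3·C·S·D² rather than C·D³, which is what allows the resolution D to be increased cheaply.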