Point clouds obtained from 3D scanning are often sparse, noisy, and irregular. To cope with these issues, recent studies have separately addressed densifying, denoising, and completing inaccurate point clouds. In this paper, we advocate that jointly solving these tasks leads to significant improvement in point cloud reconstruction. To this end, we propose a deep point cloud reconstruction network consisting of two stages: 1) a 3D sparse stacked-hourglass network for initial densification and denoising, and 2) a refinement via transformers that converts the discrete voxels into 3D points. In particular, we further improve the performance of the transformers with a newly proposed module called amplified positional encoding. This module is designed to amplify the magnitude of the positional encoding vectors differently, depending on point distances, for adaptive refinement. Extensive experiments demonstrate that our network achieves state-of-the-art performance among recent methods on the ScanNet, ICL-NUIM, and ShapeNetPart datasets. Moreover, we underline the ability of our network to generalize to real-world and unseen scenes.
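To make the amplified positional encoding idea concrete, below is a minimal PyTorch sketch under stated assumptions: a standard sinusoidal encoding of point coordinates whose magnitude is scaled by a gain predicted from each point's distance. The class name `AmplifiedPositionalEncoding` and the small MLP `amplifier` are illustrative; the abstract does not specify the exact formulation, so this is a sketch of the stated mechanism, not the paper's implementation.

```python
import torch
import torch.nn as nn


class AmplifiedPositionalEncoding(nn.Module):
    """Hypothetical sketch: scale sinusoidal positional encodings by a
    learned gain derived from each point's distance, so that encoding
    magnitude adapts per point. The paper's exact design may differ."""

    def __init__(self, d_model: int):
        super().__init__()
        assert d_model % 6 == 0, "d_model must split evenly over 3 coords x (sin, cos)"
        self.d_model = d_model
        # Assumed component: a small MLP mapping a scalar distance to a positive gain.
        self.amplifier = nn.Sequential(
            nn.Linear(1, d_model // 2),
            nn.ReLU(),
            nn.Linear(d_model // 2, 1),
            nn.Softplus(),  # keep the amplification factor positive
        )

    def forward(self, coords: torch.Tensor, dists: torch.Tensor) -> torch.Tensor:
        # coords: (N, 3) point coordinates; dists: (N, 1) per-point distances.
        n = coords.shape[0]
        # Standard sinusoidal encoding applied to each of the 3 coordinates.
        freqs = torch.arange(self.d_model // 6, device=coords.device)
        freqs = 1.0 / (10000.0 ** (6.0 * freqs / self.d_model))        # (d_model//6,)
        angles = coords.unsqueeze(-1) * freqs                          # (N, 3, d_model//6)
        pe = torch.cat([angles.sin(), angles.cos()], dim=-1)           # (N, 3, d_model//3)
        pe = pe.reshape(n, self.d_model)                               # (N, d_model)
        # Amplify the encoding magnitude per point based on its distance.
        gain = self.amplifier(dists)                                   # (N, 1)
        return gain * pe


# Usage example (shapes only; data is random for illustration):
# ape = AmplifiedPositionalEncoding(d_model=96)
# out = ape(torch.rand(1024, 3), torch.rand(1024, 1))   # -> (1024, 96)
```

The design choice sketched here keeps the familiar sinusoidal basis and isolates the adaptive behavior in a single positive scalar gain per point, which matches the abstract's description of amplifying the magnitude of positional encoding vectors based on point distances.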