We are witnessing an explosion of neural implicit representations in computer vision and graphics. Their applicability has recently expanded beyond tasks such as shape generation and image-based rendering to the fundamental problem of image-based 3D reconstruction. However, existing methods typically assume constrained 3D environments with constant illumination captured by a small set of roughly uniformly distributed cameras. We introduce a new method that enables efficient and accurate surface reconstruction from Internet photo collections in the presence of varying illumination. To achieve this, we propose a hybrid voxel- and surface-guided sampling technique that allows for more efficient ray sampling around surfaces and leads to significant improvements in reconstruction quality. Further, we present a new benchmark and protocol for evaluating reconstruction performance on such in-the-wild scenes. We perform extensive experiments, demonstrating that our approach surpasses both classical and neural reconstruction methods on a wide variety of metrics.
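The hybrid voxel- and surface-guided sampling mentioned above can be illustrated with a minimal sketch. This is not the paper's implementation; it only shows the general idea under stated assumptions: a coarse voxel grid marks which depth intervals along a ray are near the scene (here passed in as `occupied_intervals`), and an optional current surface depth estimate (`surface_t`) lets additional samples be concentrated around the surface. All function and parameter names are hypothetical.

```python
import numpy as np

def hybrid_ray_sampling(t_near, t_far, occupied_intervals, surface_t=None,
                        n_coarse=32, n_fine=16, sigma=0.02):
    """Sketch of hybrid sampling along one ray (hypothetical API).

    occupied_intervals: list of (t0, t1) depth intervals that intersect
        voxels marked as occupied by a coarse scene estimate.
    surface_t: optional estimated surface depth for this ray.
    Returns sorted sample depths in [t_near, t_far].
    """
    # Voxel-guided coarse stage: spend the sample budget only inside
    # occupied intervals, proportionally to each interval's length.
    total = sum(b - a for a, b in occupied_intervals)
    coarse = []
    for a, b in occupied_intervals:
        k = max(1, int(round(n_coarse * (b - a) / total)))
        coarse.append(np.linspace(a, b, k))
    samples = np.concatenate(coarse)

    # Surface-guided fine stage: concentrate extra samples around the
    # current surface estimate, clipped to the valid ray segment.
    if surface_t is not None:
        fine = np.random.normal(surface_t, sigma, n_fine)
        fine = np.clip(fine, t_near, t_far)
        samples = np.concatenate([samples, fine])

    return np.sort(samples)
```

Compared with uniform sampling over the full ray, a scheme like this places most queries where geometry can actually exist, which is the efficiency gain the abstract alludes to.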