We present PlanarRecon, a novel framework for globally coherent detection and reconstruction of 3D planes from a posed monocular video. Unlike previous works that detect planes in 2D from a single image, PlanarRecon incrementally detects planes in 3D for each video fragment, which consists of a set of key frames, from a volumetric representation of the scene using neural networks. A learning-based tracking and fusion module merges planes from previous fragments to form a coherent global plane reconstruction. This design allows PlanarRecon to integrate observations from multiple views within each fragment and temporal information across fragments, resulting in an accurate and coherent reconstruction of the scene abstraction with low-polygon geometry. Experiments show that the proposed approach achieves state-of-the-art performance on the ScanNet dataset while running in real time.