This paper studies holistic 3D wireframe perception (HoW-3D), a new task of perceiving both the visible and the invisible 3D wireframes of an object from a single-view 2D image. Because the non-front surfaces of an object cannot be directly observed in a single view, estimating the non-line-of-sight (NLOS) geometry in HoW-3D is a fundamentally challenging problem that remains open in computer vision. We approach HoW-3D by proposing the ABC-HoW benchmark, which is built on CAD models sourced from the ABC-dataset and contains 12k single-view images with their corresponding holistic 3D wireframe models. With this large-scale benchmark available, we present a novel Deep Spatial Gestalt (DSG) model that learns the visible junctions and line segments as a basis and then infers the NLOS 3D structures from these visible cues, following the Gestalt principles of the human visual system. Our experiments demonstrate that the DSG model performs very well in inferring holistic 3D wireframes from single-view images. Compared with strong baseline methods, the DSG model outperforms previous wireframe detectors in detecting invisible line geometry in single-view images, and it is even very competitive with prior art that takes high-fidelity point clouds as input for reconstructing 3D wireframes.