Structure-from-Motion (SfM) aims to recover 3D scene structure and camera poses from correspondences between input images. Ambiguity caused by duplicate structures (i.e., different structures with strong visual resemblance) therefore often leads to incorrect camera poses and 3D structure. To deal with this ambiguity, most existing studies resort to additional constraint information or to implicit inference based on analyzing two-view geometries or feature points. In this paper, we propose to exploit high-level information in the scene, i.e., the spatial contextual information of local regions, to guide the reconstruction. Specifically, we propose a novel structure, the \textit{track-community}, in which each community consists of a group of tracks and represents a local segment of the scene. A community detection algorithm is used to partition the scene into several segments. Potentially ambiguous segments are then detected by analyzing the neighborhood of tracks and corrected by checking pose consistency. Finally, we perform a partial reconstruction on each segment and align the partial reconstructions with a novel bidirectional consistency cost function that considers both 3D-3D correspondences and pairwise relative camera poses. Experimental results demonstrate that our approach can robustly alleviate reconstruction failures resulting from visually indistinguishable structures and accurately merge the partial reconstructions.
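To make the final merging step more concrete, the sketch below aligns one partial reconstruction to another from 3D-3D correspondences alone, using the standard closed-form least-squares similarity transform (Umeyama's method). This is not the bidirectional consistency cost function described above, which additionally uses pairwise relative camera poses. It only illustrates the 3D-3D term under the assumption that matched points are already known, and the function name `align_similarity` is hypothetical.

```python
import numpy as np


def align_similarity(src, dst):
    """Return (s, R, t) minimizing sum ||dst_i - (s * R @ src_i + t)||^2.

    src, dst: (N, 3) arrays of matched 3D points from two partial
    reconstructions (assumed to be outlier-free correspondences).
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    n = len(src)

    # Center both point sets on their centroids.
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_src, dst - mu_dst

    # Cross-covariance between the centered sets, then its SVD.
    cov = dst_c.T @ src_c / n
    U, S, Vt = np.linalg.svd(cov)

    # Flip the last axis if needed so that R is a proper rotation.
    D = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        D[2, 2] = -1.0
    R = U @ D @ Vt

    # Scale from the singular values and the variance of the source set.
    var_src = (src_c ** 2).sum() / n
    s = float(np.trace(np.diag(S) @ D) / var_src)

    # Translation maps the scaled, rotated source centroid onto dst's.
    t = mu_dst - s * R @ mu_src
    return s, R, t


if __name__ == "__main__":
    # Synthetic check: transform random points and recover the transform.
    rng = np.random.default_rng(0)
    pts = rng.normal(size=(50, 3))
    angle = np.pi / 4
    R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                       [np.sin(angle), np.cos(angle), 0.0],
                       [0.0, 0.0, 1.0]])
    moved = 1.5 * pts @ R_true.T + np.array([0.2, 0.0, -1.0])
    s, R, t = align_similarity(pts, moved)
    print("scale:", s)
    print("rotation error:", np.abs(R - R_true).max())
```

A closed-form solution like this one is commonly used as an initial estimate. With real data, the correspondences contain outliers, so it would typically be wrapped in a robust estimator before any joint refinement.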