Homography estimation is an important task in computer vision applications such as image stitching, video stabilization, and camera calibration. Traditional homography estimation methods depend heavily on the quantity and distribution of feature points, leading to poor robustness in textureless scenes. Learning-based solutions, on the contrary, try to learn robust deep features but demonstrate unsatisfactory performance in scenes with low overlap rates. In this paper, we address both problems simultaneously by designing a contextual correlation layer, which can capture long-range correlation on feature maps and be flexibly bridged into a learning framework. In addition, considering that a single homography cannot represent the complex spatial transformation in depth-varying images with parallax, we propose to predict multi-grid homography from global to local. Moreover, we equip our network with depth perception capability by introducing a novel depth-aware shape-preserved loss. Extensive experiments demonstrate the superiority of our method over state-of-the-art solutions on both a synthetic benchmark dataset and a real-world dataset. The codes and models will be available at https://github.com/nie-lang/Multi-Grid-Deep-Homogarphy.
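The contextual correlation layer is only named in this abstract and detailed later in the paper. As a rough, hedged illustration of the kind of long-range (global) correlation it is built to capture, the sketch below computes a dense correlation volume between two feature maps. The function name, tensor shapes, and cosine normalization are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch (assumed shapes and normalization, not the paper's exact layer):
# correlate every spatial position of one feature map with every position of the other.
import torch
import torch.nn.functional as F

def global_correlation(feat_a: torch.Tensor, feat_b: torch.Tensor) -> torch.Tensor:
    """Long-range correlation between two (B, C, H, W) feature maps.

    Returns a (B, H*W, H, W) volume: for each position in feat_a (last two dims),
    its similarity to every position in feat_b (second dim).
    """
    b, c, h, w = feat_a.shape
    # L2-normalize channels so each entry is a cosine similarity.
    fa = F.normalize(feat_a, dim=1).reshape(b, c, h * w)   # (B, C, HW)
    fb = F.normalize(feat_b, dim=1).reshape(b, c, h * w)   # (B, C, HW)
    corr = torch.bmm(fb.transpose(1, 2), fa)               # (B, HW_b, HW_a)
    return corr.reshape(b, h * w, h, w)

# Usage: corr = global_correlation(feat_ref, feat_target)
```

Unlike a local cost volume with a fixed search radius, this all-pairs formulation keeps matching information for large displacements, which is what makes such a layer useful in low-overlap scenes.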