Image keypoints and descriptors play a crucial role in many visual measurement tasks. In recent years, deep neural networks have been widely used to improve the performance of keypoint and descriptor extraction. However, conventional convolution operations do not provide the geometric invariance that descriptors require. To address this issue, we propose the Sparse Deformable Descriptor Head (SDDH), which learns the deformable positions of supporting features for each keypoint and constructs deformable descriptors from them. Furthermore, SDDH extracts descriptors only at sparse keypoints instead of computing a dense descriptor map, which enables efficient extraction of highly expressive descriptors. In addition, we relax the neural reprojection error (NRE) loss from dense to sparse so that it can be used to train the extracted sparse descriptors. Experimental results show that the proposed network is both efficient and powerful in various visual measurement tasks, including image matching, 3D reconstruction, and visual relocalization.
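The idea of sampling supporting features at learned offsets around each sparse keypoint can be illustrated with a minimal NumPy sketch. This is not the network described above: the function names, the linear offset predictor `w_offset`, the per-position weights `w_desc`, and the number of sample positions `m` are all assumptions made for illustration, and the offsets are predicted here from the keypoint's own feature vector as a simplification.

```python
import numpy as np

def bilinear_sample(feat, xy):
    """Bilinearly sample feat (C, H, W) at float positions xy (N, 2), given as (x, y)."""
    C, H, W = feat.shape
    x = np.clip(xy[:, 0], 0, W - 1)
    y = np.clip(xy[:, 1], 0, H - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, W - 1)
    y1 = np.minimum(y0 + 1, H - 1)
    wx, wy = x - x0, y - y0
    top = feat[:, y0, x0] * (1 - wx) + feat[:, y0, x1] * wx
    bot = feat[:, y1, x0] * (1 - wx) + feat[:, y1, x1] * wx
    return (top * (1 - wy) + bot * wy).T  # (N, C)

def sparse_deformable_descriptors(feat, keypoints, w_offset, w_desc, m=4):
    """Hypothetical sparse deformable descriptor extraction.

    feat:      (C, H, W) feature map
    keypoints: (K, 2) keypoint positions as (x, y)
    w_offset:  (C, 2*m) assumed linear map from keypoint feature to m offsets
    w_desc:    (m, C, D) assumed per-position weights that aggregate the samples
    """
    K = len(keypoints)
    center = bilinear_sample(feat, keypoints)              # (K, C)
    offsets = (center @ w_offset).reshape(K, m, 2)         # (K, m, 2)
    positions = keypoints[:, None, :] + offsets            # deformed sample positions
    support = bilinear_sample(feat, positions.reshape(-1, 2)).reshape(K, m, -1)
    desc = np.einsum("kmc,mcd->kd", support, w_desc)       # aggregate supporting features
    desc /= np.linalg.norm(desc, axis=1, keepdims=True) + 1e-8  # L2-normalize
    return desc, positions

# Usage with random stand-in data (no trained weights involved).
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 16, 16))
keypoints = np.array([[3.5, 4.25], [10.0, 12.0]])
w_offset = 0.1 * rng.standard_normal((8, 2 * 4))
w_desc = rng.standard_normal((4, 8, 16))
desc, positions = sparse_deformable_descriptors(feat, keypoints, w_offset, w_desc)
print(desc.shape, positions.shape)
```

Because features are sampled only at the `K` keypoints and their `m` supporting positions, the cost in this sketch scales with the number of keypoints rather than with the image size, which reflects the efficiency argument for sparse over dense descriptor extraction.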