Existing methods detect keypoints in a non-differentiable way, so they cannot directly optimize keypoint positions through back-propagation. To address this issue, we present a differentiable keypoint detection module that outputs accurate sub-pixel keypoints. A reprojection loss is then proposed to directly optimize these sub-pixel keypoints, and a dispersity peak loss is presented for accurate keypoint regularization. We also extract descriptors at sub-pixel precision and train them with the stable neural reprojection error loss. Moreover, we design a lightweight network for keypoint detection and descriptor extraction that runs at 95 frames per second on 640x480 images on a commercial GPU. On homography estimation, camera pose estimation, and visual (re-)localization tasks, the proposed method achieves performance equivalent to that of state-of-the-art approaches while greatly reducing inference time.
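To illustrate how a keypoint position can be made differentiable, the following is a minimal sketch assuming a soft-argmax over a local window of the score map. The abstract does not specify the mechanism, so soft-argmax is a swapped-in, commonly used technique, and the names `soft_argmax_offset`, `refine_keypoint`, `radius`, and `temperature` are hypothetical. Because the refined position is a softmax-weighted average of the window coordinates, it varies smoothly with the scores, which is what would allow a loss on the position to propagate gradients back to the score map in an autodiff framework.

```python
import math


def soft_argmax_offset(patch, temperature=0.1):
    """Return the expected (dx, dy) offset of the peak inside a square patch.

    `patch` is a list of rows of scores with an odd side length. The offset is
    measured from the patch center, in pixels. `temperature` is an assumed
    hyperparameter: smaller values pull the result toward the hard argmax.
    """
    k = len(patch)
    if k % 2 == 0 or any(len(row) != k for row in patch):
        raise ValueError("patch must be square with an odd side length")
    r = k // 2
    # Subtract the maximum score before exponentiating for numerical stability.
    peak = max(max(row) for row in patch)
    total = dx = dy = 0.0
    for i, row in enumerate(patch):
        for j, score in enumerate(row):
            w = math.exp((score - peak) / temperature)
            total += w
            dx += w * (j - r)  # columns map to x
            dy += w * (i - r)  # rows map to y
    return dx / total, dy / total


def refine_keypoint(score_map, x, y, radius=2, temperature=0.1):
    """Refine an integer keypoint (x, y) to a sub-pixel position.

    Assumes (x, y) lies at least `radius` pixels away from the map border.
    """
    rows = score_map[y - radius:y + radius + 1]
    patch = [row[x - radius:x + radius + 1] for row in rows]
    dx, dy = soft_argmax_offset(patch, temperature)
    return x + dx, y + dy
```

For example, if two horizontally adjacent pixels carry equally high scores, `refine_keypoint` returns an x-coordinate close to the midpoint between them rather than snapping to either pixel.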