Local feature detectors and descriptors are essential in many computer vision tasks, such as SLAM and 3D reconstruction. In this paper, we introduce two separate CNNs, the lightweight SobelNet and DesNet, to detect keypoints and to compute dense local descriptors, respectively. The detector and the descriptor work in parallel. A Sobel filter extracts the edge structure of the input image, which serves as the input to the detector CNN. Keypoint locations are obtained by applying non-maximum suppression (NMS) to the output map of the CNN. We design a Gaussian loss to train SobelNet to detect corner points as keypoints. DesNet, in contrast, takes the original grayscale image as input and is trained with circle loss. In addition, the output maps of SobelNet are needed when training DesNet. We evaluate our method on several benchmarks, including HPatches, the ETH benchmark, and FM-Bench. SobelNet achieves better or comparable performance with less computation than recent state-of-the-art (SOTA) methods. For a 640x480 image, inference takes 7.59 ms for SobelNet and 1.09 ms for DesNet on an RTX 2070 SUPER.
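The two non-learned stages of the detector described above, Sobel filtering at the input and NMS at the output, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the CNN between the two stages is omitted, and the function names `sobel_magnitude` and `nms_keypoints`, the NMS radius, and the score threshold are all assumptions made for illustration.

```python
def sobel_magnitude(img):
    """Gradient magnitude of a 2D grayscale image (list of rows), 3x3 Sobel kernels.

    Border pixels are left at 0.0 (an assumption; the paper does not specify padding).
    """
    h, w = len(img), len(img[0])
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal gradient kernel
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical gradient kernel
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0.0
            for dy in range(3):
                for dx in range(3):
                    p = img[y + dy - 1][x + dx - 1]
                    gx += kx[dy][dx] * p
                    gy += ky[dy][dx] * p
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


def nms_keypoints(score, radius=1, threshold=0.5):
    """Return (row, col) positions that are strict local maxima with score >= threshold.

    `radius` and `threshold` are hypothetical parameters, not values from the paper.
    """
    h, w = len(score), len(score[0])
    keypoints = []
    for y in range(h):
        for x in range(w):
            s = score[y][x]
            if s < threshold:
                continue
            is_max = True
            # Compare against every neighbour inside the (2*radius+1)^2 window.
            for ny in range(max(0, y - radius), min(h, y + radius + 1)):
                for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                    if (ny, nx) != (y, x) and score[ny][nx] >= s:
                        is_max = False
            if is_max:
                keypoints.append((y, x))
    return keypoints


# A vertical step edge: the Sobel response is nonzero only around the step.
edge_img = [[0, 0, 1, 1] for _ in range(4)]
edges = sobel_magnitude(edge_img)

# A hand-made score map standing in for the CNN output map.
score_map = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9, 0.2, 0.0],
    [0.0, 0.3, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.8],
]
keypoints = nms_keypoints(score_map)
```

In this sketch the score map is written by hand; in the pipeline the abstract describes, it would be the output map SobelNet produces from the Sobel-filtered image.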