We introduce a lightweight network that improves the descriptors of keypoints within the same image. The network takes the original descriptors and the geometric properties of the keypoints as input, and enhances the descriptors with an MLP-based self-boosting stage followed by a Transformer-based cross-boosting stage. The enhanced descriptors can be either real-valued or binary. We use the proposed network to boost both hand-crafted descriptors (ORB, SIFT) and state-of-the-art learning-based descriptors (SuperPoint, ALIKE), and evaluate them on image matching, visual localization, and structure-from-motion tasks. The results show that our method significantly improves performance on each task, particularly in challenging cases such as large illumination changes or repetitive patterns. Our method requires only 3.2 ms on a desktop GPU and 27 ms on an embedded GPU to process 2000 features, which is fast enough for use in a practical system.
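The two-stage design described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the layer sizes, the random weights, the single attention head, the additive fusion of descriptor and geometry encodings, the sign-based binarization, and all function names (`mlp`, `self_attention`, `init_params`, `boost`) are assumptions made for illustration only.

```python
import numpy as np


def mlp(x, w1, b1, w2, b2):
    """Two-layer perceptron with ReLU, applied to each keypoint (row) independently."""
    return np.maximum(x @ w1 + b1, 0.0) @ w2 + b2


def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product attention over all keypoints of one image."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(q.shape[1])
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ v


def init_params(desc_dim, geom_dim, out_dim, hidden=64, seed=0):
    """Random weights standing in for trained ones (hypothetical sizes)."""
    rng = np.random.default_rng(seed)

    def linear(n_in, n_out):
        return rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_in, n_out))

    def mlp_weights(n_in):
        return (linear(n_in, hidden), np.zeros(hidden),
                linear(hidden, out_dim), np.zeros(out_dim))

    return {
        "desc_mlp": mlp_weights(desc_dim),
        "geom_mlp": mlp_weights(geom_dim),
        "attn": tuple(linear(out_dim, out_dim) for _ in range(3)),
    }


def boost(desc, geom, params, binary=False):
    """Enhance descriptors `desc` (N, D) using keypoint geometry `geom` (N, G)."""
    # Self-boosting stage: each keypoint is encoded on its own.
    # Summing the two encodings is an assumption of this sketch.
    x = mlp(desc, *params["desc_mlp"]) + mlp(geom, *params["geom_mlp"])
    # Cross-boosting stage: every keypoint attends to all keypoints in the image.
    x = x + self_attention(x, *params["attn"])
    if binary:
        # Assumed binarization: threshold each dimension at zero.
        return (x > 0).astype(np.uint8)
    # Real-valued output, L2-normalized per keypoint.
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-12)
```

Calling `boost(desc, geom, params)` on an `(N, 128)` descriptor array and an `(N, 4)` geometry array (for example x, y, scale, orientation, again an assumed layout) returns an `(N, out_dim)` array of unit-norm descriptors, and `binary=True` returns 0/1 codes of the same shape. A trained model would replace the random weights and would typically stack several attention layers.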