We propose an attentive local feature descriptor suitable for large-scale image retrieval, referred to as DELF (DEep Local Feature). The new feature is based on convolutional neural networks, which are trained only with image-level annotations on a landmark image dataset. To identify semantically useful local features for image retrieval, we also propose an attention mechanism for keypoint selection, which shares most network layers with the descriptor. This framework can be used for image retrieval as a drop-in replacement for other keypoint detectors and descriptors, enabling more accurate feature matching and geometric verification. Our system produces reliable confidence scores to reject false positives; in particular, it is robust against queries that have no correct match in the database. To evaluate the proposed descriptor, we introduce a new large-scale dataset, referred to as the Google-Landmarks dataset, which poses challenges in both the database and the queries, such as background clutter, partial occlusion, multiple landmarks, and objects at variable scales. We show that DELF outperforms state-of-the-art global and local descriptors in the large-scale setting by significant margins. Code and the dataset are available at the project webpage (https://github.com/tensorflow/models/tree/master/research/delf).
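The attention-based keypoint selection described above can be illustrated with a minimal sketch. It assumes a dense feature map and a per-location attention score map have already been computed by some network. The function name `select_keypoints`, the parameters `top_k` and `min_score`, and the L2 normalization step are illustrative assumptions, not details taken from the abstract or from the released code.

```python
import numpy as np

def select_keypoints(features, attention, top_k=1000, min_score=0.0):
    """Pick the highest-scoring locations of a dense feature map.

    features:  (H, W, D) array of dense local descriptors (assumed given).
    attention: (H, W) array of per-location relevance scores (assumed given).
    top_k, min_score: hypothetical selection parameters for this sketch.
    """
    h, w, d = features.shape
    flat_scores = attention.reshape(-1)
    flat_feats = features.reshape(-1, d)

    # Keep only locations whose attention score exceeds the threshold.
    candidates = np.flatnonzero(flat_scores > min_score)

    # Order the surviving locations by descending score and keep the top_k.
    ranking = np.argsort(-flat_scores[candidates], kind="stable")
    chosen = candidates[ranking][:top_k]

    # Recover (row, col) positions on the feature map grid.
    rows, cols = np.unravel_index(chosen, (h, w))
    locations = np.stack([rows, cols], axis=1)

    # L2-normalize the selected descriptors (an assumption of this sketch).
    descriptors = flat_feats[chosen]
    norms = np.linalg.norm(descriptors, axis=1, keepdims=True)
    descriptors = descriptors / np.maximum(norms, 1e-12)

    return locations, descriptors, flat_scores[chosen]
```

The returned locations and descriptors could then be passed to any nearest-neighbor matcher and geometric verification step, in the same place where the output of a conventional keypoint detector and descriptor would be used.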