In this paper we address issues with image retrieval benchmarking on the standard and popular Oxford 5k and Paris 6k datasets. In particular, we address annotation errors, the size of the dataset, and the level of challenge. New annotation for both datasets is created with extra attention to the reliability of the ground truth. Three new protocols of varying difficulty are introduced. The protocols allow fair comparison between different methods, including those using a dataset pre-processing stage. For each dataset, 15 new challenging queries are introduced. Finally, a new set of 1M hard, semi-automatically cleaned distractors is selected. An extensive comparison of state-of-the-art methods is performed on the new benchmark. Different types of methods are evaluated, ranging from local-feature-based to modern CNN-based methods. The best results are achieved by combining the best of both worlds. Most importantly, image retrieval appears far from being solved.