We introduce a comprehensive benchmark for local features and robust estimation algorithms, focusing on the downstream task, the accuracy of the reconstructed camera pose, as our primary metric. Our pipeline's modular structure allows easy integration, configuration, and combination of different methods and heuristics. We demonstrate this by embedding dozens of popular algorithms and evaluating them, from seminal works to the cutting edge of machine learning research. We show that, with proper settings, classical solutions may still outperform the perceived state of the art. Besides establishing the actual state of the art, the experiments reveal unexpected properties of Structure from Motion (SfM) pipelines that can help improve their performance, for both algorithmic and learned methods. Data and code are available online at https://github.com/vcg-uvic/image-matching-benchmark, providing an easy-to-use and flexible framework for benchmarking local features and robust estimation methods, both alongside and against top-performing methods. This work provides a basis for the Image Matching Challenge: https://vision.uvic.ca/image-matching-challenge.
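To make the notion of pose accuracy concrete, the following is a minimal sketch of one common way to score an estimated relative pose against ground truth: the angular error of the rotation and the angular error of the translation direction, combined by taking the larger of the two. This is an illustrative assumption, not necessarily the exact metric implemented in the benchmark; the function names (`rotation_error_deg`, `translation_error_deg`, `pose_error_deg`, `rot_z`) are hypothetical and do not come from the released code.

```python
import numpy as np


def rotation_error_deg(R_gt, R_est):
    """Angle (degrees) of the residual rotation R_gt^T @ R_est."""
    cos = (np.trace(R_gt.T @ R_est) - 1.0) / 2.0
    # Clip to guard against values slightly outside [-1, 1] from round-off.
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))


def translation_error_deg(t_gt, t_est):
    """Angle (degrees) between translation directions.

    Only directions are compared, since the scale of the translation is
    not recoverable from two views. Assumes both vectors are non-zero.
    """
    cos = np.dot(t_gt, t_est) / (np.linalg.norm(t_gt) * np.linalg.norm(t_est))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))


def pose_error_deg(R_gt, t_gt, R_est, t_est):
    """Illustrative combined error: the worse of the two angular errors."""
    return max(rotation_error_deg(R_gt, R_est),
               translation_error_deg(t_gt, t_est))


def rot_z(deg):
    """Rotation about the z axis by `deg` degrees (helper for the example)."""
    a = np.radians(deg)
    return np.array([[np.cos(a), -np.sin(a), 0.0],
                     [np.sin(a),  np.cos(a), 0.0],
                     [0.0,        0.0,       1.0]])


if __name__ == "__main__":
    R_gt, t_gt = rot_z(0.0), np.array([1.0, 0.0, 0.0])
    # Estimate is off by 5 degrees in rotation; translation differs only in scale.
    R_est, t_est = rot_z(5.0), np.array([3.0, 0.0, 0.0])
    print(pose_error_deg(R_gt, t_gt, R_est, t_est))
```

A pose would then count as correct when this error falls below a chosen angular threshold; both the threshold and the way the two errors are combined are design choices left open here.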