In this paper, we address the problem of estimating scale factors between images. We formulate scale estimation as the prediction of a probability distribution over scale factors. We design a new architecture, ScaleNet, that exploits dilated convolutions as well as self- and cross-correlation layers to predict the scale between images. We demonstrate that rectifying images with estimated scales leads to significant performance improvements for various tasks and methods. Specifically, we show how ScaleNet can be combined with sparse local features and dense correspondence networks to improve camera pose estimation, 3D reconstruction, or dense geometric matching on different benchmarks and datasets. We provide an extensive evaluation on several tasks and analyze the computational overhead of ScaleNet. The code, evaluation protocols, and trained models are publicly available at https://github.com/axelBarroso/ScaleNet.
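The core formulation above, casting scale estimation as predicting a probability distribution over a discrete set of scale factors, can be sketched in a few lines. The bin values, the softmax readout, and the log-space averaging below are illustrative assumptions, not the paper's actual design; in ScaleNet the logits would come from the correlation-based network.

```python
import numpy as np

# Illustrative candidate scale factors (log-spaced, hypothetical choice).
SCALE_BINS = np.array([0.25, 0.5, 1.0, 2.0, 4.0])

def softmax(logits):
    # Numerically stable softmax over the scale bins.
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def expected_scale(logits):
    """Collapse a predicted distribution over scale bins to one scale factor.

    Averaging in log-space keeps the estimate symmetric with respect to
    inverse scalings (a mass on 2.0 mirrors a mass on 0.5).
    """
    p = softmax(logits)
    return float(np.exp(np.sum(p * np.log(SCALE_BINS))))

# A network would output the logits; here we fake logits peaking at the
# 2.0 bin, so the expected scale is close to 2.0.
logits = np.array([-2.0, -1.0, 0.0, 3.0, -1.0])
scale = expected_scale(logits)
```

Predicting a distribution rather than regressing a single value lets the model express ambiguity between scale hypotheses, which the rectification step can then exploit.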