We tackle the problem of estimating flow between two images with large lighting variations. Recent learning-based flow estimation frameworks have shown remarkable performance on image pairs with small displacements and constant illumination, but perform poorly under large viewpoint changes and lighting variations because pixel-wise flow annotations are unavailable for such cases. We observe that Structure-from-Motion (SfM) techniques can easily estimate relative camera poses between image pairs with large viewpoint changes and lighting variations. We propose LIFE, a novel weakly supervised framework that trains a neural network to estimate accurate lighting-invariant flows between image pairs. Sparse correspondences are conventionally established by matching features whose descriptors encode local image contents. However, local image contents are inevitably ambiguous and error-prone during cross-image feature matching, which hinders downstream tasks. We propose to guide feature matching with the flows predicted by LIFE, which resolves ambiguous matches by exploiting the abundant context in the image pairs. We show that LIFE outperforms previous flow learning frameworks by large margins in challenging scenarios, consistently improves feature matching, and benefits downstream tasks.
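The abstract does not spell out how SfM-derived poses supervise the flow network. One plausible mechanism, sketched below under our own assumptions, is an epipolar constraint: a pixel mapped by the predicted flow should land on the epipolar line induced by the relative camera pose. The function names, tensor shapes, and the point-to-line loss here are illustrative, not the paper's exact formulation.

```python
# A minimal sketch (not the paper's formulation) of weakly supervising dense
# flow with relative camera poses: flow-mapped pixels are penalized by their
# distance to the epipolar lines induced by the pose.
import torch

def skew(t):
    """3x3 skew-symmetric matrix [t]_x such that [t]_x @ v = t x v."""
    zero = t.new_zeros(())
    return torch.stack([
        torch.stack([zero, -t[2], t[1]]),
        torch.stack([t[2], zero, -t[0]]),
        torch.stack([-t[1], t[0], zero]),
    ])

def fundamental_from_pose(R, t, K1, K2):
    """F = K2^{-T} [t]_x R K1^{-1} for relative pose (R, t) and intrinsics."""
    E = skew(t) @ R                               # essential matrix
    return torch.linalg.inv(K2).T @ E @ torch.linalg.inv(K1)

def epipolar_flow_loss(flow, F, H, W):
    """Mean point-to-epipolar-line distance of flow-mapped pixels.

    flow: (2, H, W) predicted flow from image 1 to image 2.
    """
    ys, xs = torch.meshgrid(torch.arange(H, dtype=torch.float32),
                            torch.arange(W, dtype=torch.float32),
                            indexing="ij")
    ones = torch.ones_like(xs)
    p1 = torch.stack([xs, ys, ones], dim=0).reshape(3, -1)          # source pixels
    p2 = torch.stack([xs + flow[0], ys + flow[1], ones], dim=0).reshape(3, -1)
    lines = F @ p1                                 # epipolar lines in image 2
    num = (p2 * lines).sum(dim=0).abs()            # |p2^T F p1| per pixel
    den = torch.sqrt(lines[0] ** 2 + lines[1] ** 2).clamp_min(1e-8)
    return (num / den).mean()                      # distance from point to line
```

Note that a point-to-line distance constrains the flow only across the epipolar line, not along it, which is exactly why pose-derived supervision of this kind is weak rather than fully pixel-wise.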
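A hypothetical sketch of how predicted flows could guide sparse feature matching, as the abstract proposes: the flow narrows each keypoint's candidate set to a small neighborhood in the other image, so descriptor similarity only has to disambiguate locally. The search radius, the masking scheme, and all names below are illustrative assumptions, not the paper's implementation.

```python
# Flow-guided sparse matching sketch: restrict each keypoint's candidates to
# keypoints near its flow-predicted position, then pick the best descriptor
# match among those candidates. Radius and masking are illustrative choices.
import torch

def flow_guided_match(kp1, desc1, kp2, desc2, flow, radius=8.0):
    """kp1: (N, 2) pixel coords in image 1; desc1: (N, D) normalized descriptors;
    kp2: (M, 2); desc2: (M, D); flow: (2, H, W) from image 1 to image 2."""
    xs = kp1[:, 0].round().long().clamp(0, flow.shape[2] - 1)
    ys = kp1[:, 1].round().long().clamp(0, flow.shape[1] - 1)
    mapped = kp1 + flow[:, ys, xs].T               # flow-predicted positions in image 2
    dist = torch.cdist(mapped, kp2)                # (N, M) spatial distances
    sim = desc1 @ desc2.T                          # (N, M) descriptor similarities
    sim = sim.masked_fill(dist > radius, float("-inf"))  # keep only nearby candidates
    scores, idx = sim.max(dim=1)                   # best local candidate per keypoint
    valid = torch.isfinite(scores)                 # drop keypoints with no candidate
    return torch.nonzero(valid).squeeze(1), idx[valid]
```

The design intuition matches the abstract's claim: spatial context from the dense flow removes most ambiguous candidates before local descriptors are compared at all.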