Deep learning has demonstrated its power in image rectification by leveraging the representation capacity of deep neural networks through supervised training on large-scale synthetic datasets. However, such models may overfit the synthetic images and generalize poorly to real-world fisheye images, due to the limited universality of any specific distortion model and the lack of explicit modeling of the distortion and rectification process. In this paper, we propose a novel self-supervised image rectification (SIR) method based on an important insight: the rectified results of distorted images of the same scene captured through different lenses should be identical. Specifically, we devise a new network architecture with a shared encoder and several prediction heads, each of which predicts the distortion parameter of a specific distortion model. We further leverage a differentiable warping module to generate rectified and re-distorted images from the predicted distortion parameters, and exploit the intra- and inter-model consistency between them during training, yielding a self-supervised learning scheme that requires neither ground-truth distortion parameters nor undistorted images. Experiments on a synthetic dataset and on real-world fisheye images demonstrate that our method achieves comparable or even better performance than the supervised baseline and representative state-of-the-art methods. Self-supervised learning also improves the universality of distortion models while preserving their self-consistency.
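To make the two consistency terms concrete, the following is a minimal sketch in NumPy of the intra- and inter-model consistency idea, operating on radial distances rather than full images. The one-parameter division model used here (`rectify_radius`, `distort_radius`) is a hypothetical stand-in for the paper's distortion models, not the actual SIR implementation.

```python
import numpy as np

def rectify_radius(r_d, k):
    """Undistort a radial distance with a one-parameter division model
    (hypothetical stand-in for a specific distortion model)."""
    return r_d / (1.0 + k * r_d**2)

def distort_radius(r_u, k):
    """Invert the division model: solve k*r_u*r_d^2 - r_d + r_u = 0 for r_d,
    taking the root that reduces to r_u as k -> 0."""
    disc = np.sqrt(np.maximum(1.0 - 4.0 * k * r_u**2, 0.0))
    return np.where(np.abs(k) < 1e-8, r_u, (1.0 - disc) / (2.0 * k * r_u))

# Toy distortion parameters "predicted" by two heads for the same input image
r_d = np.linspace(0.1, 1.0, 5)   # distorted radii sampled from the input
k_a, k_b = 0.2, 0.2              # here the two heads happen to agree

rect_a = rectify_radius(r_d, k_a)
rect_b = rectify_radius(r_d, k_b)

# Inter-model consistency: rectified outputs of different models should match
loss_inter = np.mean((rect_a - rect_b) ** 2)

# Intra-model consistency: re-distorting the rectified result should recover the input
loss_intra = np.mean((distort_radius(rect_a, k_a) - r_d) ** 2)
```

In the actual method these losses would be computed on warped images via the differentiable warping module and backpropagated to train the shared encoder and prediction heads without ground-truth supervision.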