Stereo image pairs encode 3D scene cues as correspondences between the left and right images. To exploit these cues, recent CNN-based methods commonly use cost volume techniques to capture stereo correspondence over large disparities. However, disparities can vary significantly across stereo cameras with different baselines, focal lengths and resolutions, so the fixed maximum disparity used in cost volume techniques prevents them from handling stereo image pairs with large disparity variations. In this paper, we propose a generic parallax-attention mechanism (PAM) to capture stereo correspondence regardless of disparity variations. Our PAM integrates epipolar constraints with an attention mechanism, calculating feature similarities along the epipolar line to capture stereo correspondence. Based on our PAM, we propose a parallax-attention stereo matching network (PASMnet) for stereo matching and a parallax-attention stereo image super-resolution network (PASSRnet) for stereo image super-resolution. Moreover, we introduce a new large-scale dataset named Flickr1024 for stereo image super-resolution. Experimental results show that our PAM is generic and can effectively learn stereo correspondence under large disparity variations in an unsupervised manner. Comparative results show that our PASMnet and PASSRnet achieve state-of-the-art performance.
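The idea of computing feature similarities along the epipolar line can be illustrated with a minimal sketch. It is not the PAM implementation. It assumes a rectified stereo pair (so each epipolar line is an image row), uses a plain dot product as the similarity, and omits any learned projections or occlusion handling. The function names `row_parallax_attention` and `expected_disparity`, and the sign convention of disparity as left column minus right column, are illustrative assumptions.

```python
import math


def row_parallax_attention(left_row, right_row):
    """Attention between two feature rows lying on the same epipolar line.

    left_row, right_row: lists of W feature vectors (each a list of C floats)
    taken from the same image row of a rectified stereo pair (assumption).
    Returns a W x W matrix whose entry [i][j] is the softmax-normalised
    similarity between left position i and right position j.
    """
    attn = []
    for q in left_row:
        # Dot-product similarity with every position on the right row.
        scores = [sum(a * b for a, b in zip(q, k)) for k in right_row]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        attn.append([e / total for e in exps])
    return attn


def expected_disparity(attn):
    """Soft disparity per left pixel: left column minus expected right column."""
    return [
        i - sum(j * w for j, w in enumerate(row))
        for i, row in enumerate(attn)
    ]
```

Because each left position attends over the entire right row, the sketch has no maximum-disparity parameter: the search range is simply the row width, which is what allows this kind of attention to cope with differing disparity ranges.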