The cost aggregation strategy plays a crucial role in learning-based stereo matching, where 3D convolutional filters achieve state-of-the-art accuracy but require intensive computational resources, while 2D operations need less GPU memory but are sensitive to domain shift. In this paper, we exploit a recurrent cost aggregation strategy that decouples the 4D cubic cost volume used by 3D convolutional filters into sequential cost maps along the disparity direction, instead of processing the whole volume at once. Furthermore, a novel recurrent module, Stacked Recurrent Hourglass (SRH), is proposed to process each cost map. Our hourglass network is built from Gated Recurrent Units (GRUs) and down/upsampling layers, which give the GRUs larger receptive fields. Two hourglass networks are then stacked together, and multi-scale information is passed through skip connections to improve performance in textureless areas. The proposed architecture is implemented in an end-to-end pipeline and evaluated on public datasets, where it reduces GPU memory consumption by up to 56.1\% compared with PSMNet using stacked hourglass 3D CNNs, without degrading accuracy. We further demonstrate the scalability of the proposed method on several high-resolution pairs, on which previous learning-based approaches often fail due to memory constraints. The code is released at \url{https://github.com/hongzhidu/SRHNet}.
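The idea of decoupling the cost volume along the disparity direction can be illustrated with a small NumPy sketch. It is not the released SRHNet code: the PSMNet-style concatenation cost, the per-pixel (1x1) GRU cell, the random untrained weights, the linear readout, and the softmax-based disparity regression are all assumptions made for illustration, and the hourglass down/upsampling and stacking are omitted. The names `cost_slice`, `PixelGRU`, and `recurrent_aggregate` are hypothetical.

```python
import numpy as np


def cost_slice(feat_l, feat_r, d):
    """Cost map for one disparity d, shape [2C, H, W] (assumed concatenation cost).

    Left features are paired with right features shifted by d pixels;
    the first d columns have no valid match and stay zero.
    """
    c, h, w = feat_l.shape
    out = np.zeros((2 * c, h, w), dtype=feat_l.dtype)
    out[:c, :, d:] = feat_l[:, :, d:]
    out[c:, :, d:] = feat_r[:, :, : w - d]
    return out


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class PixelGRU:
    """GRU cell applied independently at every pixel (a 1x1-convolution GRU)."""

    def __init__(self, in_ch, hid_ch, rng):
        self.hid_ch = hid_ch
        # Untrained random weights for the update gate, reset gate, and candidate.
        self.w = rng.normal(0.0, 0.1, (3, hid_ch, in_ch))
        self.u = rng.normal(0.0, 0.1, (3, hid_ch, hid_ch))

    def step(self, x, h):
        def mix(m, t):
            # Per-pixel linear map over channels: [O, C] x [C, H, W] -> [O, H, W].
            return np.einsum("oc,chw->ohw", m, t)

        z = sigmoid(mix(self.w[0], x) + mix(self.u[0], h))        # update gate
        r = sigmoid(mix(self.w[1], x) + mix(self.u[1], h))        # reset gate
        cand = np.tanh(mix(self.w[2], x) + mix(self.u[2], r * h))  # candidate state
        return (1.0 - z) * h + z * cand


def recurrent_aggregate(feat_l, feat_r, max_disp, cell, readout):
    """Process one cost map per step instead of a full [2C, D, H, W] volume."""
    _, h, w = feat_l.shape
    hidden = np.zeros((cell.hid_ch, h, w))
    scores = np.empty((max_disp, h, w))
    for d in range(max_disp):
        x = cost_slice(feat_l, feat_r, d)   # only one [2C, H, W] map is alive
        hidden = cell.step(x, hidden)       # the hidden state carries context across d
        scores[d] = np.einsum("c,chw->hw", readout, hidden)
    # Assumed readout: softmax over disparity, then the expected disparity.
    e = np.exp(scores - scores.max(axis=0))
    prob = e / e.sum(axis=0)
    return (prob * np.arange(max_disp)[:, None, None]).sum(axis=0)


rng = np.random.default_rng(0)
max_disp = 8
feat_l = rng.normal(size=(4, 6, 16))   # toy left features, [C, H, W]
feat_r = rng.normal(size=(4, 6, 16))   # toy right features, [C, H, W]
cell = PixelGRU(in_ch=8, hid_ch=5, rng=rng)
readout = rng.normal(size=5)
disp = recurrent_aggregate(feat_l, feat_r, max_disp, cell, readout)
```

In this sketch the aggregation input held in memory at any step is a single `[2C, H, W]` map, whereas a 3D-convolutional pipeline would materialize all `max_disp` maps at once as a `[2C, D, H, W]` volume. Because the weights are untrained, the resulting `disp` only demonstrates the data flow and is not a meaningful disparity estimate.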