Although stereo matching architectures based on convolutional neural networks (CNNs) have achieved impressive results, they still have some limitations: 1) Convolutional features (CF) tend to capture appearance information, which alone is inadequate for accurate matching. 2) Because their filters are static, current convolution-based disparity refinement modules often produce over-smoothed results. In this paper, we present two schemes that address these issues by integrating insights from traditional methods. First, we introduce a pairwise feature for deep stereo matching networks, named LSP (Local Similarity Pattern). By explicitly revealing the relationships among neighbors, LSP contains rich structural information, which can complement CF to give a more discriminative feature description. Second, we design a dynamic self-reassembling refinement strategy and apply it to the cost distribution and to the disparity map, respectively. The former can be combined with a unimodal distribution constraint to alleviate the over-smoothing problem, and the latter is more practical. We demonstrate the effectiveness of the proposed methods by incorporating them into two well-known baseline architectures, GwcNet and GANet-deep. Experimental results on the SceneFlow and KITTI benchmarks show that our modules significantly improve the performance of these models.
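To make the idea of a pairwise, neighbor-revealing feature concrete, the following is a minimal sketch of a local similarity descriptor. It is not the LSP definition from the paper: the 3x3 window, the cosine similarity measure, the zero value for out-of-bounds neighbors, and the names `OFFSETS`, `cosine`, and `local_similarity_pattern` are all assumptions chosen for illustration.

```python
import math

# 8-connected neighbor offsets (dy, dx). The 3x3 window is an assumption.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1),
           (0, -1),           (0, 1),
           (1, -1),  (1, 0),  (1, 1)]


def cosine(u, v, eps=1e-8):
    """Cosine similarity between two feature vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / (norm + eps)


def local_similarity_pattern(feat):
    """Describe each position by its similarity to its 8 neighbors.

    feat: H x W x C feature map given as nested lists.
    Returns an H x W x 8 nested list. Neighbors that fall outside the
    map contribute 0.0 (an assumed padding choice).
    """
    h, w = len(feat), len(feat[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            sims = []
            for dy, dx in OFFSETS:
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    sims.append(cosine(feat[y][x], feat[ny][nx]))
                else:
                    sims.append(0.0)
            row.append(sims)
        out.append(row)
    return out


# Toy 2 x 2 map with 2 channels: three positions share one feature vector,
# the bottom-left position has an orthogonal one.
feat = [[[1.0, 0.0], [1.0, 0.0]],
        [[0.0, 1.0], [1.0, 0.0]]]
lsp = local_similarity_pattern(feat)
```

In this toy map, the descriptor at the top-left position is high toward the right and bottom-right neighbors and zero toward the bottom neighbor, so it encodes local structure rather than the appearance of the position itself. Such a descriptor could then be concatenated with the convolutional feature before matching, which is one plausible way to "aid CF".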