Recent studies have spent considerable human effort on optimizing network architectures for stereo matching, but have rarely achieved both high accuracy and fast inference speed. To ease the workload of network design, neural architecture search (NAS) has been applied with great success to various sparse prediction tasks, such as image classification and object detection. However, architectures produced by existing NAS studies on dense prediction tasks, especially stereo matching, still cannot be efficiently and effectively deployed on devices with different computing capabilities. To this end, we propose to train an elastic and accurate network for stereo matching (EASNet) that supports various 3D architectural settings on devices with different computing capabilities. Given the deployment latency constraint of the target device, we can quickly extract a sub-network from the full EASNet without additional training, while the accuracy of the sub-network is still maintained. Extensive experiments show that our EASNet outperforms both state-of-the-art human-designed and NAS-based architectures on the Scene Flow and MPI Sintel datasets in terms of model accuracy and inference speed. In particular, deployed on an inference GPU, EASNet achieves a new state-of-the-art EPE of 0.73 on the Scene Flow dataset with an inference time of 100 ms, which is 4.5$\times$ faster than LEAStereo while delivering a better-quality model.
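The latency-constrained sub-network extraction described above can be illustrated with a minimal sketch. This is not the authors' implementation: the ElasticStack supernet, its depth/width choices, and the helpers measure_latency_ms and extract_subnet are hypothetical stand-ins showing how one might enumerate elastic configurations of a weight-sharing supernet and keep the largest one that fits a latency budget, without any retraining.

```python
# Minimal sketch (not the authors' code): latency-constrained sub-network
# selection from an elastic, weight-sharing supernet. ElasticStack and the
# depth/width choices are hypothetical stand-ins for EASNet's elastic settings.
import time
import itertools
import torch
import torch.nn as nn

class ElasticStack(nn.Module):
    """Toy elastic block stack that can run with fewer layers and channels."""
    def __init__(self, max_depth=4, max_width=64):
        super().__init__()
        self.stem = nn.Conv2d(3, max_width, 3, padding=1)
        self.layers = nn.ModuleList(
            [nn.Conv2d(max_width, max_width, 3, padding=1) for _ in range(max_depth)]
        )

    def forward(self, x, depth, width):
        x = self.stem(x)
        for layer in self.layers[:depth]:
            # Slice shared weights to emulate a narrower sub-network.
            w = layer.weight[:width, :width]
            b = layer.bias[:width]
            x = torch.relu(nn.functional.conv2d(x[:, :width], w, b, padding=1))
        return x

@torch.no_grad()
def measure_latency_ms(model, depth, width, input_shape=(1, 3, 256, 512), reps=20):
    x = torch.randn(*input_shape)
    model(x, depth, width)                      # warm-up pass
    start = time.perf_counter()
    for _ in range(reps):
        model(x, depth, width)
    return (time.perf_counter() - start) / reps * 1e3

def extract_subnet(model, budget_ms, depths=(1, 2, 3, 4), widths=(16, 32, 64)):
    """Return the highest-capacity (depth, width) config meeting the budget."""
    feasible = []
    for d, w in itertools.product(depths, widths):
        ms = measure_latency_ms(model, d, w)
        if ms <= budget_ms:
            feasible.append((d * w, (d, w), ms))   # crude capacity proxy for ranking
    return max(feasible)[1:] if feasible else None

if __name__ == "__main__":
    supernet = ElasticStack()
    print(extract_subnet(supernet, budget_ms=100.0))
```

In the actual paper, the elastic dimensions would cover EASNet's 3D architectural settings rather than this toy depth/width pair, but the selection loop, measuring each candidate on the target device and keeping the best one under the budget, follows the same pattern.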