The heavy computational burden of CNNs limits their use on mobile devices for dense estimation tasks. In this paper, we present a lightweight network, LEDNet, to address this problem. LEDNet employs an asymmetric encoder-decoder architecture for real-time semantic segmentation. More specifically, the encoder adopts a ResNet as its backbone network, where two new operations, channel split and shuffle, are used in each residual block to greatly reduce computation cost while maintaining high segmentation accuracy. In the decoder, an attention pyramid network (APN) is employed to further reduce the complexity of the entire network. Our model has fewer than 1M parameters and runs at over 71 FPS on a single GTX 1080Ti GPU. Comprehensive experiments demonstrate that our approach achieves state-of-the-art results in terms of the speed-accuracy trade-off on the CityScapes dataset.
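To make the channel split and shuffle operations concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It treats a feature map as a list of channels, with one scalar standing in for each channel's spatial map. The function names (`channel_split`, `channel_shuffle`, `split_shuffle_block`), the two-group shuffle, and the ordering inside the block (split, per-branch transform, concatenate, residual add, shuffle) are assumptions made for illustration; the abstract only states that both operations are used in each residual block.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def channel_split(channels: Sequence[T]) -> Tuple[List[T], List[T]]:
    """Split the channel list into two halves (first half, second half)."""
    half = len(channels) // 2
    return list(channels[:half]), list(channels[half:])


def channel_shuffle(channels: Sequence[T], groups: int = 2) -> List[T]:
    """Interleave channels across groups.

    Equivalent to viewing the channels as a (groups, per_group) grid,
    transposing it, and flattening the result.
    """
    n = len(channels)
    if n % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = n // groups
    return [
        channels[g * per_group + i]
        for i in range(per_group)
        for g in range(groups)
    ]


Branch = Callable[[List[float]], List[float]]


def split_shuffle_block(
    x: Sequence[float], left_branch: Branch, right_branch: Branch
) -> List[float]:
    """Hypothetical residual block: the ordering of steps is an assumption."""
    left, right = channel_split(x)
    # Each branch processes only half of the channels, which is where the
    # computational saving comes from.
    merged = left_branch(left) + right_branch(right)
    # Residual (identity) connection over the full input.
    summed = [a + b for a, b in zip(x, merged)]
    # Shuffle so that information mixes between the two halves in later blocks.
    return channel_shuffle(summed, groups=2)


if __name__ == "__main__":
    identity: Branch = lambda chans: list(chans)
    print(channel_shuffle([0, 1, 2, 3, 4, 5], groups=2))
    print(split_shuffle_block([1.0, 2.0, 3.0, 4.0], identity, identity))
```

In a real network each branch would be a small stack of convolutions applied to tensors rather than an identity function on scalars; the sketch only shows how the channels are routed.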