Multi-scale learning frameworks have been regarded as a capable class of models for boosting semantic segmentation. The problem is nevertheless non-trivial, especially for real-world deployments, which often demand low inference latency. In this paper, we thoroughly analyze the design of convolutional blocks (the type of convolutions and the number of channels in convolutions) and the ways of interaction across multiple scales, all from a lightweight standpoint for semantic segmentation. From these in-depth comparisons, we derive three principles, and accordingly devise Lightweight and Progressively-Scalable Networks (LPS-Net), which novelly expands network complexity in a greedy manner. Technically, LPS-Net first capitalizes on the principles to build a tiny network. Then, LPS-Net progressively scales the tiny network up to larger ones by expanding a single dimension (the number of convolutional blocks, the number of channels, or the input resolution) at a time to attain the best speed/accuracy tradeoff. Extensive experiments conducted on three datasets consistently demonstrate the superiority of LPS-Net over several efficient semantic segmentation methods. More remarkably, our LPS-Net achieves 73.4% mIoU on the Cityscapes test set at 413.5 FPS on an NVIDIA GTX 1080Ti, a 1.5% performance improvement and a 65% speed-up over the state-of-the-art STDC. Code is available at \url{https://github.com/YihengZhang-CV/LPS-Net}.
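The greedy expansion described above can be sketched as follows. This is a minimal illustration only: the `evaluate` proxy, the per-dimension step sizes, and the latency budget are all hypothetical placeholders standing in for the paper's actual training and benchmarking pipeline.

```python
# Hedged sketch of greedy single-dimension scaling: at each step, enlarge
# exactly one dimension (blocks, channels, or input-resolution scale) and
# keep whichever expansion gives the best accuracy within a latency budget.
# evaluate() below is a made-up proxy, NOT the paper's measurement setup.

def evaluate(config):
    """Hypothetical proxy returning (accuracy, latency) for a config.
    Fakes diminishing-returns accuracy and additive latency costs."""
    blocks, channels, resolution = config
    accuracy = 1.0 - 1.0 / (1 + 0.1 * blocks + 0.02 * channels + 0.5 * resolution)
    latency = 0.5 * blocks + 0.1 * channels + 4.0 * resolution
    return accuracy, latency

def expand_greedily(config, steps, latency_budget):
    """Greedily expand one dimension at a time under a latency budget."""
    step_sizes = (1, 16, 0.25)  # per-dimension increments (assumed values)
    for _ in range(steps):
        candidates = []
        for i, delta in enumerate(step_sizes):
            cand = list(config)
            cand[i] += delta
            acc, lat = evaluate(tuple(cand))
            if lat <= latency_budget:
                candidates.append((acc, tuple(cand)))
        if not candidates:
            break  # no single-dimension expansion fits the budget
        _, config = max(candidates)  # pick the most accurate feasible expansion
    return config

# Start from a tiny configuration: 2 blocks, 32 channels, 0.5x resolution.
final = expand_greedily((2, 32, 0.5), steps=5, latency_budget=30.0)
```

Expanding one dimension at a time keeps the search space linear in the number of dimensions per step, which is what makes the greedy scaling cheap compared with jointly searching all dimensions.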