Accurate semantic segmentation models typically require significant computational resources, inhibiting their use in practical applications. Recent works rely on well-crafted lightweight models to achieve fast inference. However, these models cannot flexibly adapt to varying accuracy and efficiency requirements. In this paper, we propose a simple but effective slimmable semantic segmentation (SlimSeg) method, which can be executed at different capacities during inference depending on the desired accuracy-efficiency tradeoff. More specifically, we employ parametrized channel slimming by stepwise downward knowledge distillation during training. Motivated by the observation that the segmentation results of the different submodels diverge mainly near semantic borders, we introduce an additional boundary-guided semantic segmentation loss to further improve the performance of each submodel. We show that our proposed SlimSeg, applied to various mainstream networks, produces flexible models that allow dynamic adjustment of computational cost while achieving better performance than independently trained models. Extensive experiments on the semantic segmentation benchmarks Cityscapes and CamVid demonstrate the generalization ability of our framework.
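To make the two training ingredients concrete, below is a minimal PyTorch sketch of one training step combining stepwise downward knowledge distillation across channel widths with a boundary-weighted segmentation loss. It assumes a slimmable model exposing a `set_width` hook that slims every layer; `WIDTHS`, `boundary_weight`, and the temperature `T` are illustrative assumptions, not values from the paper.

```python
# Sketch only: the exact losses and width schedule follow the abstract's
# description, not the paper's released implementation.
import torch
import torch.nn.functional as F

WIDTHS = [1.0, 0.75, 0.5, 0.25]  # hypothetical width multipliers, largest first


def boundary_mask(labels, k=3):
    # Rough semantic-border detector: a pixel is near a border if labels
    # differ within a k x k neighborhood (illustrative, not the paper's loss).
    lab = labels.unsqueeze(1).float()
    dilated = F.max_pool2d(lab, k, stride=1, padding=k // 2)
    eroded = -F.max_pool2d(-lab, k, stride=1, padding=k // 2)
    return (dilated != eroded).squeeze(1).float()


def train_step(model, images, labels, optimizer, T=1.0, boundary_weight=0.5):
    optimizer.zero_grad()
    prev_logits = None
    total_loss = 0.0
    for width in WIDTHS:                # stepwise, from the full model downward
        model.set_width(width)          # assumed hook that slims every layer
        logits = model(images)          # (N, C, H, W) segmentation logits
        if prev_logits is None:
            # Full-width submodel learns from ground truth, with extra weight
            # on pixels near semantic borders.
            loss = F.cross_entropy(logits, labels, ignore_index=255)
            per_pixel = F.cross_entropy(logits, labels, ignore_index=255,
                                        reduction="none")
            loss = loss + boundary_weight * (per_pixel * boundary_mask(labels)).mean()
        else:
            # Each slimmer submodel is distilled from the next larger one.
            teacher = F.softmax(prev_logits.detach() / T, dim=1)
            loss = F.kl_div(F.log_softmax(logits / T, dim=1), teacher,
                            reduction="batchmean") * (T * T)
        total_loss = total_loss + loss
        prev_logits = logits
    total_loss.backward()
    optimizer.step()
    return float(total_loss)
```

Detaching the larger submodel's logits keeps gradients from flowing upward, so each width is supervised only by its immediate larger neighbor, matching the "stepwise downward" distillation direction described above.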