For real-time semantic segmentation, increasing speed while maintaining high resolution is a widely discussed problem. Backbone design and fusion design have always been two essential parts of real-time semantic segmentation. We aim to design a lightweight network, based on previous design experience, that reaches state-of-the-art real-time semantic segmentation performance without any pre-training. To achieve this goal, we propose an encoder-decoder architecture that applies a decoder network to a backbone model designed for real-time segmentation tasks, and we design three different ways to fuse semantic and detail information in the aggregation phase. We have conducted extensive experiments on two semantic segmentation benchmarks. Experiments on the Cityscapes and CamVid datasets show that the proposed FRFNet strikes a balance between speed and accuracy. It achieves 69% mean Intersection over Union (mIoU) on the Cityscapes test set at a speed of 132 FPS on a single RTX 2080Ti card. The code is available at https://github.com/favoMJ/FRFNet.
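As a reference for the mIoU figure quoted above, the following is a minimal sketch of how mean Intersection over Union is commonly computed from predicted and ground-truth class labels. It is not taken from the FRFNet repository; the function name `mean_iou`, its parameters, and the use of 255 as an ignored label are assumptions made for illustration.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over flattened label sequences (illustrative sketch).

    pred, target: equal-length sequences of integer class labels.
    ignore_index: ground-truth label to skip (assumed value, not from the paper).
    Assumes every prediction lies in the range [0, num_classes).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixels do not count toward any class
        if p == t:
            # Correct pixel: belongs to both intersection and union of class t
            intersection[t] += 1
            union[t] += 1
        else:
            # Wrong pixel: enlarges the union of both the predicted and true class
            union[p] += 1
            union[t] += 1
    # Average only over classes that occur in the prediction or the ground truth
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Toy usage with two classes and four pixels
pred = [0, 0, 1, 1]
target = [0, 1, 1, 1]
print(round(mean_iou(pred, target, num_classes=2), 4))  # 0.5833
```

In the toy example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12. Benchmark evaluations typically accumulate the intersection and union counts over the whole test set before averaging, which this sketch also does if the labels of all images are concatenated first.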