PP-MobileSeg: Explore the Fast and Accurate Semantic Segmentation Model on Mobile Devices


April 11, 2023


Shiyu Tang, Ting Sun, Juncai Peng, Guowei Chen, Yuying Hao, Manhui Lin, Zhihong Xiao, Jiangbin You, Yi Liu

From arXiv, 8 pages, 3 figures

The success of transformers in computer vision has led to several attempts to adapt them for mobile devices, but their performance remains unsatisfactory in some real-world applications. To address this issue, we propose PP-MobileSeg, a semantic segmentation model that achieves state-of-the-art performance on mobile devices. PP-MobileSeg comprises three novel parts: the StrideFormer backbone, the Aggregated Attention Module (AAM), and the Valid Interpolate Module (VIM). The four-stage StrideFormer backbone is built with MobileNetV3 (MV3) blocks and strided SEA attention, and it extracts rich semantic and detailed features with minimal parameter overhead. The AAM first filters the detailed features through semantic feature ensemble voting and then combines them with semantic features to enhance the semantic information. Finally, we propose the VIM to upsample the downsampled feature to the resolution of the input image. This upsampling step is the most significant contributor to overall model latency, and the VIM significantly reduces it by interpolating only the classes present in the final prediction. Extensive experiments show that PP-MobileSeg achieves a superior tradeoff between accuracy, model size, and latency compared to other methods. On the ADE20K dataset, PP-MobileSeg achieves 1.57% higher mIoU than SeaFormer-Base with 32.9% fewer parameters and 42.3% faster inference on a Qualcomm Snapdragon 855. Source code is available at https://github.com/PaddlePaddle/PaddleSeg/tree/release/2.8.
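The idea behind the VIM, interpolating only the classes that appear in the prediction, can be illustrated with a small pure-Python sketch. This is not the PaddleSeg implementation: the function names `bilinear_resize` and `valid_interpolate`, the nested-list tensor layout, and the align-corners sampling convention are all assumptions made for illustration, and the sketch assumes the present classes are found by an argmax on the low-resolution logits.

```python
def bilinear_resize(channel, out_h, out_w):
    """Bilinearly resize one 2D channel (list of rows), align-corners style."""
    in_h, in_w = len(channel), len(channel[0])
    out = []
    for i in range(out_h):
        # Map the output row back to a fractional input row.
        y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        wy = y - y0
        row = []
        for j in range(out_w):
            x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            wx = x - x0
            top = channel[y0][x0] * (1 - wx) + channel[y0][x1] * wx
            bottom = channel[y1][x0] * (1 - wx) + channel[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


def valid_interpolate(logits, out_h, out_w):
    """Hypothetical VIM-like step: upsample only the classes that win somewhere.

    logits is a list of C channels, each an H x W list of lists.
    Returns (label_map, present_classes).
    """
    n_classes = len(logits)
    in_h, in_w = len(logits[0]), len(logits[0][0])

    # 1. Low-resolution argmax: which classes appear at all?
    present = sorted({
        max(range(n_classes), key=lambda c: logits[c][y][x])
        for y in range(in_h)
        for x in range(in_w)
    })

    # 2. Interpolate only the channels of those classes.
    upsampled = [bilinear_resize(logits[c], out_h, out_w) for c in present]

    # 3. Argmax over the kept channels, mapped back to original class ids.
    labels = [
        [
            present[max(range(len(present)), key=lambda k: upsampled[k][i][j])]
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]
    return labels, present


# Four classes on a 2 x 2 map; only classes 0 and 2 ever win.
logits = [
    [[1.0, 1.0], [0.0, 0.0]],      # class 0 wins in the top row
    [[-1.0, -1.0], [-1.0, -1.0]],  # class 1 never wins
    [[0.0, 0.0], [1.0, 1.0]],      # class 2 wins in the bottom row
    [[-1.0, -1.0], [-1.0, -1.0]],  # class 3 never wins
]
labels, present = valid_interpolate(logits, 4, 4)
print(present)  # [0, 2]
```

In this toy case only two of the four channels are interpolated. On a dataset with many classes, such as ADE20K, a single image typically contains far fewer classes than the label set, which is where the saving would come from. Because channels that win at no low-resolution location are discarded, the result can in principle differ from interpolating every channel and then taking the argmax.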

