PRSeg: A Lightweight Patch Rotate MLP Decoder for Semantic Segmentation - Zhuanzhi Papers


Segmentation · Semantic segmentation · Decoding · Channel · Feature map

May 1, 2023

PRSeg: A Lightweight Patch Rotate MLP Decoder for Semantic Segmentation


Yizhe Ma, Fangjian Lin, Sitong Wu, Shengwei Tian, Long Yu

From arXiv; accepted by IEEE TCSVT

Lightweight MLP-based decoders have become increasingly promising for semantic segmentation. However, a channel-wise MLP cannot expand the receptive field and therefore lacks the context modeling capacity that is critical to semantic segmentation. In this paper, we propose a parameter-free patch rotate operation that reorganizes pixels spatially. It first divides the feature map into multiple groups and then rotates the patches within each group. Based on this operation, we design a novel segmentation network, named PRSeg, which consists of an off-the-shelf backbone and a lightweight Patch Rotate MLP decoder containing multiple Dynamic Patch Rotate Blocks (DPR-Blocks). In each DPR-Block, a fully connected layer follows a Patch Rotate Module (PRM) to exchange spatial information between pixels. Specifically, in the PRM, the feature map is first split along the channel dimension into a reserved part and a rotated part according to the probabilities predicted by the Dynamic Channel Selection Module (DCSM), and the patch rotate operation is performed only on the rotated part. Extensive experiments on the ADE20K, Cityscapes and COCO-Stuff 10K datasets demonstrate the effectiveness of our approach. We expect PRSeg to promote the development of MLP-based decoders for semantic segmentation.
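The group-then-rotate idea and the channel split described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation. The abstract does not specify the exact permutation, so "rotate" is assumed here to mean a cyclic shift of patch positions inside each group. The function names `patch_rotate` and `patch_rotate_module`, the `group`, `patch`, `shift` and `threshold` parameters, and the hand-written `channel_prob` vector standing in for the DCSM output are all hypothetical.

```python
import numpy as np


def patch_rotate(x, group=4, patch=2, shift=1):
    """Parameter-free spatial reshuffle of a (C, H, W) feature map.

    Assumption: "rotate" is modelled as a cyclic shift of patch positions
    inside each group; the abstract does not pin down the permutation.
    """
    c, h, w = x.shape
    if h % group or w % group or group % patch:
        raise ValueError("H, W must be divisible by group, and group by patch")
    n = group // patch  # patches per group side
    gy, gx = h // group, w // group
    # Axes: (C, group_y, patch_row, py, group_x, patch_col, px)
    t = x.reshape(c, gy, n, patch, gx, n, patch)
    # Gather the patch indices: (C, group_y, group_x, patch_row, patch_col, py, px)
    t = t.transpose(0, 1, 4, 2, 5, 3, 6)
    # Flatten the n*n patches of each group and shift them cyclically.
    t = t.reshape(c, gy, gx, n * n, patch, patch)
    t = np.roll(t, shift, axis=3)
    # Undo the layout change to get back a (C, H, W) map.
    t = t.reshape(c, gy, gx, n, n, patch, patch)
    t = t.transpose(0, 1, 3, 5, 2, 4, 6)
    return t.reshape(c, h, w)


def patch_rotate_module(x, channel_prob, threshold=0.5, **kwargs):
    """Rotate only the channels selected by a per-channel probability.

    Assumption: channel_prob stands in for the DCSM prediction, and a fixed
    threshold decides which channels form the rotated part.
    """
    rotate_mask = np.asarray(channel_prob) > threshold
    out = x.copy()  # reserved channels pass through unchanged
    if rotate_mask.any():
        out[rotate_mask] = patch_rotate(x[rotate_mask], **kwargs)
    return out


# Two channels on a 4x4 map: channel 0 is rotated, channel 1 is reserved.
x = np.arange(2 * 4 * 4, dtype=float).reshape(2, 4, 4)
y = patch_rotate_module(x, channel_prob=[0.9, 0.1], group=4, patch=2)
```

Because the operation only permutes pixel positions, it adds no parameters. A fully connected layer applied per pixel afterwards then sees values that came from other spatial locations, which is how the abstract describes spatial information being exchanged.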

