Transformers have shown impressive performance in various natural language processing and computer vision tasks, owing to their capability of modeling long-range dependencies. Recent progress has demonstrated that combining such Transformers with CNN-based semantic image segmentation models is very promising. However, how well a pure Transformer-based approach can perform on image segmentation has not yet been well studied. In this work, we explore a novel framework for semantic image segmentation: encoder-decoder based Fully Transformer Networks (FTN). Specifically, we first propose a Pyramid Group Transformer (PGT) as the encoder to progressively learn hierarchical features while reducing the computational complexity of the standard Vision Transformer (ViT). We then propose a Feature Pyramid Transformer (FPT) to fuse semantic-level and spatial-level information from multiple levels of the PGT encoder for semantic image segmentation. Surprisingly, this simple baseline achieves new state-of-the-art results on multiple challenging semantic segmentation benchmarks, including PASCAL Context, ADE20K, and COCO-Stuff. The source code will be released upon publication of this work.
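To make the decoder's role concrete, the following is a minimal NumPy sketch of the multi-level fusion idea the abstract describes: hierarchical encoder features (as a PGT-style encoder might emit) are each projected to a common channel width, upsampled to the finest resolution, and summed. The shapes, channel widths, and the linear-projection-plus-sum fusion are illustrative assumptions, not the actual FPT, which performs this fusion with Transformer blocks.

```python
import numpy as np

def upsample(x, factor):
    """Nearest-neighbor upsampling of a (C, H, W) feature map."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def fuse_pyramid(features, out_dim, rng):
    """Fuse multi-level features into one map at the finest scale.

    `features` is a list of (C_i, H_i, W_i) maps ordered fine-to-coarse,
    each level at half the spatial resolution of the previous one. Each
    level is linearly projected to `out_dim` channels (a 1x1-conv-like
    stand-in for the FPT's learned fusion), upsampled, and summed.
    """
    _, H, W = features[0].shape
    fused = np.zeros((out_dim, H, W))
    for i, f in enumerate(features):
        C = f.shape[0]
        # Random projection only for illustration; in practice this is learned.
        proj = rng.standard_normal((out_dim, C)) / np.sqrt(C)
        g = np.einsum('oc,chw->ohw', proj, f)
        fused += upsample(g, 2 ** i)
    return fused

rng = np.random.default_rng(0)
# Four pyramid levels with doubling channel widths and halving spatial size
# (illustrative numbers, e.g. strides 4/8/16/32 for a 128x128 input).
feats = [rng.standard_normal((64 * 2 ** i, 32 // 2 ** i, 32 // 2 ** i))
         for i in range(4)]
out = fuse_pyramid(feats, out_dim=256, rng=rng)
print(out.shape)  # (256, 32, 32)
```

A per-pixel classifier applied to `out` would then yield the segmentation logits at 1/4 of the input resolution.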