Max Pooling with Vision Transformers reconciles class and shape in weakly supervised semantic segmentation - Zhuanzhi (专知) Papers

October 31, 2022

Max Pooling with Vision Transformers reconciles class and shape in weakly supervised semantic segmentation

Simone Rossetti,Damiano Zappia,Marta Sanzari,Marco Schaerf,Fiora Pirri

From arXiv; 28 pages, 9 figures; ECCV 2022 conference

Weakly Supervised Semantic Segmentation (WSSS) research has explored many directions to improve the typical pipeline CNN plus class activation maps (CAM) plus refinements, given the image-class label as the only supervision. Though the gap with the fully supervised methods is reduced, further abating the spread seems unlikely within this framework. On the other hand, WSSS methods based on Vision Transformers (ViT) have not yet explored valid alternatives to CAM. ViT features have been shown to retain a scene layout, and object boundaries in self-supervised learning. To confirm these findings, we prove that the advantages of transformers in self-supervised methods are further strengthened by Global Max Pooling (GMP), which can leverage patch features to negotiate pixel-label probability with class probability. This work proposes a new WSSS method dubbed ViT-PCM (ViT Patch-Class Mapping), not based on CAM. The end-to-end presented network learns with a single optimization process, refined shape and proper localization for segmentation masks. Our model outperforms the state-of-the-art on baseline pseudo-masks (BPM), where we achieve $69.3\%$ mIoU on PascalVOC 2012 $val$ set. We show that our approach has the least set of parameters, though obtaining higher accuracy than all other approaches. In a sentence, quantitative and qualitative results of our method reveal that ViT-PCM is an excellent alternative to CNN-CAM based architectures.
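The abstract's central idea, that Global Max Pooling lets patch-level label probabilities be trained from image-level class labels alone, can be illustrated with a toy example. The following is a minimal sketch under stated assumptions, not the authors' ViT-PCM implementation: the patch logits are made-up numbers standing in for the output of a ViT-based patch classifier, and the helper names (`softmax`, `global_max_pool`, `binary_cross_entropy`) are hypothetical, written in plain Python for readability.

```python
import math


def softmax(logits):
    """Turn one patch's class logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def global_max_pool(patch_probs):
    """For each class, keep the highest probability found over all patches."""
    num_classes = len(patch_probs[0])
    return [max(p[c] for p in patch_probs) for c in range(num_classes)]


def binary_cross_entropy(pred, target, eps=1e-7):
    """Mean binary cross-entropy between pooled class scores and image labels."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(pred)


# Assumed toy input: 4 patches, 3 classes (illustrative values only).
patch_logits = [
    [4.0, 0.0, 0.0],
    [0.0, 3.0, 0.0],
    [3.5, 0.5, 0.0],
    [0.0, 2.5, 0.5],
]

# Patch-level label probabilities.
patch_probs = [softmax(row) for row in patch_logits]

# Image-level class probabilities obtained by pooling over patches.
image_probs = global_max_pool(patch_probs)

# The only supervision: which classes appear in the image (multi-hot).
image_labels = [1, 1, 0]
loss = binary_cross_entropy(image_probs, image_labels)

# At inference time, a coarse mask is the per-patch argmax.
patch_labels = [max(range(len(p)), key=p.__getitem__) for p in patch_probs]

print("image-level probabilities:", [round(p, 3) for p in image_probs])
print("loss:", round(loss, 4))
print("patch labels:", patch_labels)
```

Because the maximum selects a single patch per class, the gradient of the image-level loss reaches the patch that most strongly supports each class. This is one plausible reading of how pooling can "negotiate" between pixel-label probability and class probability; the paper should be consulted for the actual loss and architecture.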

