Weakly supervised semantic segmentation (WSSS) with image-level labels is a challenging task in computer vision. Mainstream approaches follow a multi-stage framework and suffer from high training costs. In this paper, we explore the potential of Contrastive Language-Image Pre-training (CLIP) models to localize different categories with only image-level labels and without any further training. To efficiently generate high-quality segmentation masks from CLIP, we propose a novel framework called CLIP-ES for WSSS. Our framework improves all three stages of WSSS with special designs for CLIP: 1) We introduce the softmax function into GradCAM and exploit the zero-shot ability of CLIP to suppress the confusion caused by non-target classes and backgrounds. Meanwhile, to take full advantage of CLIP, we re-explore text inputs under the WSSS setting and customize two text-driven strategies: sharpness-based prompt selection and synonym fusion. 2) To simplify the stage of CAM refinement, we propose a real-time class-aware attention-based affinity (CAA) module based on the inherent multi-head self-attention (MHSA) in CLIP-ViTs. 3) When training the final segmentation model with the masks generated by CLIP, we introduce a confidence-guided loss (CGL) to mitigate noise and focus on confident regions. Our proposed framework dramatically reduces the training cost of WSSS and demonstrates CLIP's capability of localizing objects. CLIP-ES achieves state-of-the-art performance on Pascal VOC 2012 and MS COCO 2014 while taking only 10% of the time of previous methods for pseudo mask generation. Code is available at https://github.com/linyq2117/CLIP-ES.
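To make the first design concrete, here is a minimal sketch of the softmax-GradCAM idea: back-propagating the softmax probability of the target class, rather than its raw logit, couples the classes, so gradients on features shared with competing categories cancel and the resulting CAM is less confused by non-target classes. The toy classifier head, tensor shapes, and variable names below are illustrative assumptions, not the released CLIP-ES implementation.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
feats = torch.randn(1, 8, 7, 7, requires_grad=True)  # (B, C, H, W) feature maps
head = torch.nn.Linear(8, 3)                         # toy 3-class head (assumption)
logits = head(feats.mean(dim=(2, 3)))                # (B, K) logits via global pooling

target = 1                                           # index of the target class
probs = F.softmax(logits, dim=-1)                    # softmax couples the classes
probs[0, target].backward()                          # grad of probability, not raw logit

weights = feats.grad.mean(dim=(2, 3))                # GradCAM channel weights, (B, C)
cam = F.relu((weights[..., None, None] * feats).sum(dim=1)).detach()
cam = cam / (cam.amax() + 1e-8)                      # normalized CAM, shape (B, H, W)
```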
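And a hedged sketch of the third design, the confidence-guided loss: a per-pixel cross-entropy on the CLIP-generated pseudo masks that ignores pixels whose pseudo label falls below a confidence threshold, so training focuses on confident regions. The hard threshold, the `confidence` input, and all names are assumptions for illustration; the paper's exact formulation may differ.

```python
import torch
import torch.nn.functional as F

def confidence_guided_loss(logits, pseudo_mask, confidence, thresh=0.95):
    """Sketch of a confidence-guided loss under the assumptions above.

    logits:      (B, K, H, W) segmentation logits
    pseudo_mask: (B, H, W) long tensor of pseudo labels from CLIP-generated masks
    confidence:  (B, H, W) confidence of each pseudo label in [0, 1]
    """
    per_pixel = F.cross_entropy(logits, pseudo_mask, reduction="none")  # (B, H, W)
    keep = (confidence >= thresh).float()        # keep only confident pixels
    return (per_pixel * keep).sum() / keep.sum().clamp(min=1.0)
```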