基于文本引导的视觉基础模型的遥感图像语义分割(Text2Seg) (Text2Seg: Remote Sensing Image Semantic Segmentation via Text-Guided Visual Foundation Models) - 专知论文

会员服务 ·

0

图像语义 · 图像语义分割 · 遥感图像 · 分割 · 大模型 ·

2023 年 4 月 20 日

Text2Seg: Remote Sensing Image Semantic Segmentation via Text-Guided Visual Foundation Models

翻译：基于文本引导的视觉基础模型的遥感图像语义分割(Text2Seg)

Jielu Zhang,Zhongliang Zhou,Gengchen Mai,Lan Mu,Mengxuan Hu,Sheng Li

from arxiv, 10 pages, 6 figures

Recent advancements in foundation models (FMs), such as GPT-4 and LLaMA, have attracted significant attention due to their exceptional performance in zero-shot learning scenarios. Similarly, in the field of visual learning, models like Grounding DINO and the Segment Anything Model (SAM) have exhibited remarkable progress in open-set detection and instance segmentation tasks. It is undeniable that these FMs will profoundly impact a wide range of real-world visual learning tasks, ushering in a new paradigm shift for developing such models. In this study, we concentrate on the remote sensing domain, where the images are notably dissimilar from those in conventional scenarios. We developed a pipeline that leverages multiple FMs to facilitate remote sensing image semantic segmentation tasks guided by text prompt, which we denote as Text2Seg. The pipeline is benchmarked on several widely-used remote sensing datasets, and we present preliminary results to demonstrate its effectiveness. Through this work, we aim to provide insights into maximizing the applicability of visual FMs in specific contexts with minimal model tuning. The code is available at https://github.com/Douglas2Code/Text2Seg.

翻译：近年来，基础模型（FMs），如GPT-4和LLaMA的最新进展在零样本学习场景中表现出了卓越的性能，引起了广泛关注。同样，在视觉学习领域，Grounding DINO和Segment Anything Model（SAM）等模型在开放式检测和实例分割任务中展现出了显着的进展。这些基础模型无疑将深刻影响各种现实世界的视觉学习任务，开启开发此类模型的新范式。在本研究中，我们着重研究遥感领域，其中图像与传统场景中的图像明显不同。我们开发了一种利用多个基础模型的管道来促进文本提示引导的遥感图像语义分割任务的方法，我们将其称为Text2Seg。在几个广泛使用的遥感数据集上进行基准测试，并展示初步结果以证明其有效性。通过这项工作，我们旨在提供关于在特定情境下最大化视觉基础模型的适用性且最小化模型调整的见解。代码在 https://github.com/Douglas2Code/Text2Seg 上可用。

2

相关内容

图像语义

【Hugging Face】使用自定义数据集微调语义分割模型，Fine-Tune a Semantic Segmentation Model with a Custom Dataset

【Hugging Face】使用自定义数据集微调语义分割模型，Fine-Tune a Semantic Segmentation Model with a Custom Dataset

专知会员服务

21+阅读 · 2022年3月18日

【CVPR 2022】视觉提示调整（VPT），Vision Prompt Tuning

【CVPR 2022】视觉提示调整（VPT），Vision Prompt Tuning

专知会员服务

32+阅读 · 2022年3月12日

【CVPR 2022】使用多模态Transformer的端到端视频对象分割，End-to-End Referring Video Object Segmentation with Multimodal Transformer

【CVPR 2022】使用多模态Transformer的端到端视频对象分割，End-to-End Referring Video Object Segmentation with Multimodal Transformer

专知会员服务

28+阅读 · 2022年3月3日

深度学习图像分割综述论文最新版，Image Segmentation Using Deep Learning: A Survey

深度学习图像分割综述论文最新版，Image Segmentation Using Deep Learning: A Survey

专知会员服务

93+阅读 · 2020年4月11日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

【CVPR2020】从未标记的视频中学习视频对象分割，Learning Video Object Segmentation from Unlabeled Videos

【CVPR2020】从未标记的视频中学习视频对象分割，Learning Video Object Segmentation from Unlabeled Videos

专知会员服务

36+阅读 · 2020年3月12日

【图像分割| 2019最新综述】自然图像和医学图像的深层语义分割，附21页PDF（Deep Semantic Segmentation of Natural and Medical Images: A Review）

【图像分割| 2019最新综述】自然图像和医学图像的深层语义分割，附21页PDF（Deep Semantic Segmentation of Natural and Medical Images: A Review）

专知会员服务

54+阅读 · 2019年11月16日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

TorchSeg：基于pytorch的语义分割算法开源了

TorchSeg：基于pytorch的语义分割算法开源了

极市平台

20+阅读 · 2019年1月28日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【论文推荐】最新六篇图像分割相关论文—控制、全卷积网络、子空间表示、多模态图像分割

【论文推荐】最新六篇图像分割相关论文—控制、全卷积网络、子空间表示、多模态图像分割

专知

25+阅读 · 2018年4月15日

【论文推荐】最新5篇图像分割（Image Segmentation）相关论文—多重假设、超像素分割、自监督、图、生成对抗网络

【论文推荐】最新5篇图像分割（Image Segmentation）相关论文—多重假设、超像素分割、自监督、图、生成对抗网络

专知

27+阅读 · 2018年2月7日

【论文推荐】最新5篇图像分割相关论文—条件随机场和深度特征学习、移动端网络、长期视觉定位、主动学习、主动轮廓模型、生成对抗性网络

【论文推荐】最新5篇图像分割相关论文—条件随机场和深度特征学习、移动端网络、长期视觉定位、主动学习、主动轮廓模型、生成对抗性网络

专知

13+阅读 · 2018年1月23日

【推荐】YOLO实时目标检测(6fps)

【推荐】YOLO实时目标检测(6fps)

机器学习研究会

20+阅读 · 2017年11月5日

【推荐】全卷积语义分割综述

【推荐】全卷积语义分割综述

机器学习研究会

19+阅读 · 2017年8月31日

基于自学习对比度视觉注意模型和自适应深度特征的无分类目标检测

国家自然科学基金

2+阅读 · 2015年12月31日

融合目标感知与对比度的图像和视频显著性检测技术研究

国家自然科学基金

4+阅读 · 2015年12月31日

数据驱动的室内场景设计与建模

国家自然科学基金

1+阅读 · 2014年12月31日

时敏目标天/空在线检测与识别

国家自然科学基金

7+阅读 · 2013年12月31日

社会化多模式路径规划服务中的数据模型与算法研究

国家自然科学基金

1+阅读 · 2013年12月31日

视觉注意与人脑记忆机制启发下的感兴趣目标提取与跟踪

国家自然科学基金

1+阅读 · 2012年12月31日

钙钛矿型材料中磁电耦合的第一性原理研究

国家自然科学基金

0+阅读 · 2012年12月31日

面向云计算的图像同态加密与高效检索关键技术研究

国家自然科学基金

0+阅读 · 2012年12月31日

球面全方位目标测量及跟踪理论

国家自然科学基金

0+阅读 · 2012年12月31日

钙钛矿型铁电/铁磁薄膜异质结的界面微观结构与磁电耦合性能的关联性

国家自然科学基金

0+阅读 · 2008年12月31日

Integrating Geometric Control into Text-to-Image Diffusion Models for High-Quality Detection Data Generation via Text Prompt

Arxiv

0+阅读 · 2023年6月7日

Conditional Diffusion Models for Weakly Supervised Medical Image Segmentation

Conditional Diffusion Models for Weakly Supervised Medical Image Segmentation

Arxiv

0+阅读 · 2023年6月6日

Generalist Vision Foundation Models for Medical Imaging: A Case Study of Segment Anything Model on Zero-Shot Medical Segmentation

Arxiv

0+阅读 · 2023年6月5日

LiT-4-RSVQA: Lightweight Transformer-based Visual Question Answering in Remote Sensing

Arxiv

0+阅读 · 2023年6月2日

Transformer-Based Visual Segmentation: A Survey

Arxiv

0+阅读 · 2023年6月2日

Transformers in Remote Sensing: A Survey

Transformers in Remote Sensing: A Survey

Arxiv

25+阅读 · 2022年9月2日

Conditional Prompt Learning for Vision-Language Models

Conditional Prompt Learning for Vision-Language Models

Arxiv

13+阅读 · 2022年3月10日

Image Segmentation Using Deep Learning: A Survey

Arxiv

17+阅读 · 2020年11月15日

TensorMask: A Foundation for Dense Object Segmentation

TensorMask: A Foundation for Dense Object Segmentation

Arxiv

10+阅读 · 2019年3月28日

Exploring Models and Data for Remote Sensing Image Caption Generation

Arxiv

14+阅读 · 2017年12月21日

VIP会员

文章信息

相关主题

图像语义分割

相关VIP内容

【Hugging Face】使用自定义数据集微调语义分割模型，Fine-Tune a Semantic Segmentation Model with a Custom Dataset

【Hugging Face】使用自定义数据集微调语义分割模型，Fine-Tune a Semantic Segmentation Model with a Custom Dataset

专知会员服务

21+阅读 · 2022年3月18日

【CVPR 2022】视觉提示调整（VPT），Vision Prompt Tuning

【CVPR 2022】视觉提示调整（VPT），Vision Prompt Tuning

专知会员服务

32+阅读 · 2022年3月12日

【CVPR 2022】使用多模态Transformer的端到端视频对象分割，End-to-End Referring Video Object Segmentation with Multimodal Transformer

【CVPR 2022】使用多模态Transformer的端到端视频对象分割，End-to-End Referring Video Object Segmentation with Multimodal Transformer

专知会员服务

28+阅读 · 2022年3月3日

深度学习图像分割综述论文最新版，Image Segmentation Using Deep Learning: A Survey

深度学习图像分割综述论文最新版，Image Segmentation Using Deep Learning: A Survey

专知会员服务

93+阅读 · 2020年4月11日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

【CVPR2020】从未标记的视频中学习视频对象分割，Learning Video Object Segmentation from Unlabeled Videos

【CVPR2020】从未标记的视频中学习视频对象分割，Learning Video Object Segmentation from Unlabeled Videos

专知会员服务

36+阅读 · 2020年3月12日

【图像分割| 2019最新综述】自然图像和医学图像的深层语义分割，附21页PDF（Deep Semantic Segmentation of Natural and Medical Images: A Review）

【图像分割| 2019最新综述】自然图像和医学图像的深层语义分割，附21页PDF（Deep Semantic Segmentation of Natural and Medical Images: A Review）

专知会员服务

54+阅读 · 2019年11月16日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《乌克兰无人机产业：志愿者与政策在构建新兴无人机产业中的协同作用》最新报告

《人工智能辅助决策中的数据可视化：系统性综述》

人工智能驱动弹药制造现代化：美国陆军转型之路

《敏捷作战部署中枢纽-辐条基地选址优化研究》80页

相关资讯

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

TorchSeg：基于pytorch的语义分割算法开源了

TorchSeg：基于pytorch的语义分割算法开源了

极市平台

20+阅读 · 2019年1月28日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【论文推荐】最新六篇图像分割相关论文—控制、全卷积网络、子空间表示、多模态图像分割

【论文推荐】最新六篇图像分割相关论文—控制、全卷积网络、子空间表示、多模态图像分割

专知

25+阅读 · 2018年4月15日

【论文推荐】最新5篇图像分割（Image Segmentation）相关论文—多重假设、超像素分割、自监督、图、生成对抗网络

【论文推荐】最新5篇图像分割（Image Segmentation）相关论文—多重假设、超像素分割、自监督、图、生成对抗网络

专知

27+阅读 · 2018年2月7日

【论文推荐】最新5篇图像分割相关论文—条件随机场和深度特征学习、移动端网络、长期视觉定位、主动学习、主动轮廓模型、生成对抗性网络

【论文推荐】最新5篇图像分割相关论文—条件随机场和深度特征学习、移动端网络、长期视觉定位、主动学习、主动轮廓模型、生成对抗性网络

专知

13+阅读 · 2018年1月23日

【推荐】YOLO实时目标检测(6fps)

【推荐】YOLO实时目标检测(6fps)

机器学习研究会

20+阅读 · 2017年11月5日

【推荐】全卷积语义分割综述

【推荐】全卷积语义分割综述

机器学习研究会

19+阅读 · 2017年8月31日

相关论文

Integrating Geometric Control into Text-to-Image Diffusion Models for High-Quality Detection Data Generation via Text Prompt

Arxiv

0+阅读 · 2023年6月7日

Conditional Diffusion Models for Weakly Supervised Medical Image Segmentation

Conditional Diffusion Models for Weakly Supervised Medical Image Segmentation

Arxiv

0+阅读 · 2023年6月6日

Generalist Vision Foundation Models for Medical Imaging: A Case Study of Segment Anything Model on Zero-Shot Medical Segmentation

Arxiv

0+阅读 · 2023年6月5日

LiT-4-RSVQA: Lightweight Transformer-based Visual Question Answering in Remote Sensing

Arxiv

0+阅读 · 2023年6月2日

Transformer-Based Visual Segmentation: A Survey

Arxiv

0+阅读 · 2023年6月2日

Transformers in Remote Sensing: A Survey

Transformers in Remote Sensing: A Survey

Arxiv

25+阅读 · 2022年9月2日

Conditional Prompt Learning for Vision-Language Models

Conditional Prompt Learning for Vision-Language Models

Arxiv

13+阅读 · 2022年3月10日

Image Segmentation Using Deep Learning: A Survey

Arxiv

17+阅读 · 2020年11月15日

TensorMask: A Foundation for Dense Object Segmentation

TensorMask: A Foundation for Dense Object Segmentation

Arxiv

10+阅读 · 2019年3月28日

Exploring Models and Data for Remote Sensing Image Caption Generation

Arxiv

14+阅读 · 2017年12月21日

相关基金

基于自学习对比度视觉注意模型和自适应深度特征的无分类目标检测

国家自然科学基金

2+阅读 · 2015年12月31日

融合目标感知与对比度的图像和视频显著性检测技术研究

国家自然科学基金

4+阅读 · 2015年12月31日

数据驱动的室内场景设计与建模

国家自然科学基金

1+阅读 · 2014年12月31日

时敏目标天/空在线检测与识别

国家自然科学基金

7+阅读 · 2013年12月31日

社会化多模式路径规划服务中的数据模型与算法研究

国家自然科学基金

1+阅读 · 2013年12月31日

视觉注意与人脑记忆机制启发下的感兴趣目标提取与跟踪

国家自然科学基金

1+阅读 · 2012年12月31日

钙钛矿型材料中磁电耦合的第一性原理研究

国家自然科学基金

0+阅读 · 2012年12月31日

面向云计算的图像同态加密与高效检索关键技术研究

国家自然科学基金

0+阅读 · 2012年12月31日

球面全方位目标测量及跟踪理论

国家自然科学基金

0+阅读 · 2012年12月31日

钙钛矿型铁电/铁磁薄膜异质结的界面微观结构与磁电耦合性能的关联性

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员