Pre-trained vision-language (V-L) models such as CLIP have shown excellent generalization ability to downstream tasks. However, they are sensitive to the choice of input text prompts and require careful selection of prompt templates to perform well. Inspired by the Natural Language Processing (NLP) literature, recent CLIP adaptation approaches learn prompts as textual inputs to fine-tune CLIP for downstream tasks. We note that using prompting to adapt representations in a single branch of CLIP (language or vision) is sub-optimal, since it does not allow the flexibility to dynamically adjust both representation spaces for a downstream task. In this work, we propose Multi-modal Prompt Learning (MaPLe) for both the vision and language branches to improve alignment between the vision and language representations. Our design promotes strong coupling between the vision and language prompts to ensure mutual synergy and discourages learning independent uni-modal solutions. Further, we learn separate prompts across different early stages to progressively model stage-wise feature relationships, allowing rich context learning. We evaluate the effectiveness of our approach on three representative tasks: generalization to novel classes, to new target datasets, and to unseen domain shifts. Compared with the state-of-the-art method Co-CoOp, MaPLe exhibits favorable performance and achieves an absolute gain of 3.45% on novel classes and 2.72% on the overall harmonic mean, averaged over 11 diverse image recognition datasets. Code: https://tinyurl.com/2dzs8f3w.
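To make the coupling idea concrete, below is a minimal PyTorch sketch of how language prompts can be tied to vision prompts through a learnable projection at each prompted encoder layer. It is an illustration of the design described above, not the authors' implementation: the module name CoupledPrompts, the prompt length, the embedding dimensions (512 for text, 768 for vision), and the prompt depth of 9 are assumptions chosen for the example; refer to the linked code for the actual details.

```python
# Minimal sketch of coupled multi-modal prompts (illustrative only).
import torch
import torch.nn as nn

class CoupledPrompts(nn.Module):
    def __init__(self, n_prompts=2, text_dim=512, vision_dim=768, depth=9):
        super().__init__()
        # Learnable language prompts, one set per prompted transformer layer
        # (the "separate prompts across different early stages").
        self.text_prompts = nn.ParameterList(
            nn.Parameter(0.02 * torch.randn(n_prompts, text_dim))
            for _ in range(depth)
        )
        # Coupling functions: vision prompts are projected from the language
        # prompts, discouraging independent uni-modal solutions.
        self.couplers = nn.ModuleList(
            nn.Linear(text_dim, vision_dim) for _ in range(depth)
        )

    def forward(self, layer_idx: int):
        t = self.text_prompts[layer_idx]    # (n_prompts, text_dim)
        v = self.couplers[layer_idx](t)     # (n_prompts, vision_dim)
        return t, v

# At each of the first `depth` encoder layers, `t` would be prepended to the
# text tokens and `v` to the image patch tokens before running that layer.
prompts = CoupledPrompts()
t0, v0 = prompts(0)
print(t0.shape, v0.shape)  # torch.Size([2, 512]) torch.Size([2, 768])
```

Deriving the vision prompts from the language prompts, rather than learning them independently in each branch, is what enforces the mutual synergy between the two representation spaces.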