多式联运帮助多式联运:采用多式模式的跨式小热学习 (Multimodality Helps Unimodality: Cross-Modal Few-Shot Learning with Multimodal Models) - 专知论文

会员服务 ·

0

多峰值 · 小样本学习 · Learning · SimPLe · MoDELS ·

2023 年 1 月 29 日

Multimodality Helps Unimodality: Cross-Modal Few-Shot Learning with Multimodal Models

翻译：多式联运帮助多式联运:采用多式模式的跨式小热学习

Zhiqiu Lin,Samuel Yu,Zhiyi Kuang,Deepak Pathak,Deva Ramanan

from arxiv, Project website: https://linzhiqiu.github.io/papers/cross_modal/

The ability to quickly learn a new task with minimal instruction - known as few-shot learning - is a central aspect of intelligent agents. Classical few-shot benchmarks make use of few-shot samples from a single modality, but such samples may not be sufficient to characterize an entire concept class. In contrast, humans use cross-modal information to learn new concepts efficiently. In this work, we demonstrate that one can indeed build a better ${\bf visual}$ dog classifier by ${\bf read}$ing about dogs and ${\bf listen}$ing to them bark. To do so, we exploit the fact that recent multimodal foundation models such as CLIP are inherently cross-modal, mapping different modalities to the same representation space. Specifically, we propose a simple cross-modal adaptation approach that learns from few-shot examples spanning different modalities. By repurposing class names as additional one-shot training samples, we achieve SOTA results with an embarrassingly simple linear classifier for vision-language adaptation. Furthermore, we show that our approach can benefit existing methods such as prefix tuning, adapters, and classifier ensembling. Finally, to explore other modalities beyond vision and language, we construct the first (to our knowledge) audiovisual few-shot benchmark and use cross-modal training to improve the performance of both image and audio classification.

翻译：快速学习具有最起码指导的新任务的能力 — — 被称为微小的学习 — — 是智能剂的核心方面。经典的微小基准利用了单一模式的少发样本, 但这类样本可能不足以描述整个概念类。相反, 人类使用跨模式信息来有效地学习新概念。在这项工作中, 我们证明, 一个人确实可以用$\bf读取狗和狗叫声等最差的狗, 并用$bf读取。为了做到这一点, 我们利用了以下事实: 最近的多式基础模型, 如CLIP 本质上是跨模式的, 将不同模式映射到相同的代表空间。具体地说, 我们建议一种简单的跨模式适应方法, 从几个例子中学习跨越不同模式的新概念。通过将班级名称重新定位为额外的一发培训样本, 我们用一个难堪的简单音线性分类器来实现SOTA结果。此外, 我们展示了我们的方法可以让现有的方法受益, 比如前的调法、调整者、调整者、和升级者将不同模式的模型改进我们的图像模式来探索其它方法。

0

相关内容

多峰值

【CVPR 2022】多模态视频字幕的端到端生成预训练，End-to-end Generative Pretraining for Multimodal Video Captioning

【CVPR 2022】多模态视频字幕的端到端生成预训练，End-to-end Generative Pretraining for Multimodal Video Captioning

专知会员服务

27+阅读 · 2022年3月3日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

元迁移学习的小样本学习，Meta-transfer Learning for Few-shot Learning

元迁移学习的小样本学习，Meta-transfer Learning for Few-shot Learning

专知会员服务

159+阅读 · 2020年2月29日

【微软研究院】IMAGEBERT: CROSS-MODAL PRE-TRAINING WITH LARGE-SCALE WEAK-SUPERVISED IMAGE-TEXT DATA

【微软研究院】IMAGEBERT: CROSS-MODAL PRE-TRAINING WITH LARGE-SCALE WEAK-SUPERVISED IMAGE-TEXT DATA

专知会员服务

43+阅读 · 2020年1月28日

【AAAI2020】多模态注意力语义图嵌入多标签分类（Cross-Modality Attention with Semantic Graph Embedding for Multi-Label Classification）

【AAAI2020】多模态注意力语义图嵌入多标签分类（Cross-Modality Attention with Semantic Graph Embedding for Multi-Label Classification）

专知会员服务

92+阅读 · 2019年12月22日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

vae 相关论文表示学习 1

vae 相关论文表示学习 1

CreateAMind

12+阅读 · 2018年9月6日

【论文推荐】最新八篇情感分析相关论文—Pair-wise判别器、多模态情感分析、上下文语境、Gated 卷积网络

【论文推荐】最新八篇情感分析相关论文—Pair-wise判别器、多模态情感分析、上下文语境、Gated 卷积网络

专知

20+阅读 · 2018年6月29日

混合光轴方向卷曲空间电磁特异介质的特性研究

国家自然科学基金

0+阅读 · 2015年12月31日

向量组合学习框架下基于依存混合树的中文语义解析研究

国家自然科学基金

3+阅读 · 2014年12月31日

多孔荧光配位聚合物的构筑及其对客体荧光传感研究

国家自然科学基金

0+阅读 · 2014年12月31日

蓖麻矮化相关RcDof基因功能分析及调控机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

非对称导叶布局及其减振机理研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于语义理解的高分辨率遥感图像变化分析方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

模糊和随机环境下的粒计算与信息融合研究

国家自然科学基金

1+阅读 · 2012年12月31日

新型高速调制半导体激光器研究

国家自然科学基金

0+阅读 · 2012年12月31日

核-壳界面构造与量子点光学性质单分散

国家自然科学基金

0+阅读 · 2011年12月31日

利用超支化聚合物构建高效安全的非病毒基因载体

国家自然科学基金

0+阅读 · 2009年12月31日

Knowledge Distillation from Multiple Foundation Models for End-to-End Speech Recognition

Arxiv

2+阅读 · 2023年3月20日

Learning to Exploit Temporal Structure for Biomedical Vision-Language Processing

Arxiv

0+阅读 · 2023年3月16日

Multimodality Representation Learning: A Survey on Evolution, Pretraining and Its Applications

Arxiv

20+阅读 · 2023年2月1日

Multimodal Deep Learning

Arxiv

29+阅读 · 2023年1月12日

Attention Bottlenecks for Multimodal Fusion

Arxiv

31+阅读 · 2021年6月30日

Cross-Modal Discrete Representation Learning

Arxiv

18+阅读 · 2021年6月10日

Generative Models as a Data Source for Multiview Representation Learning

Arxiv

16+阅读 · 2021年6月9日

Contrastive Clustering

Arxiv

31+阅读 · 2020年9月21日

Hierarchical Graph Pooling with Structure Learning

Arxiv

13+阅读 · 2019年11月14日

Multimodal Model-Agnostic Meta-Learning via Task-Aware Modulation

Multimodal Model-Agnostic Meta-Learning via Task-Aware Modulation

Arxiv

25+阅读 · 2019年10月30日

VIP会员

文章信息

相关主题

小样本学习

相关VIP内容

【CVPR 2022】多模态视频字幕的端到端生成预训练，End-to-end Generative Pretraining for Multimodal Video Captioning

【CVPR 2022】多模态视频字幕的端到端生成预训练，End-to-end Generative Pretraining for Multimodal Video Captioning

专知会员服务

27+阅读 · 2022年3月3日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

元迁移学习的小样本学习，Meta-transfer Learning for Few-shot Learning

元迁移学习的小样本学习，Meta-transfer Learning for Few-shot Learning

专知会员服务

159+阅读 · 2020年2月29日

【微软研究院】IMAGEBERT: CROSS-MODAL PRE-TRAINING WITH LARGE-SCALE WEAK-SUPERVISED IMAGE-TEXT DATA

【微软研究院】IMAGEBERT: CROSS-MODAL PRE-TRAINING WITH LARGE-SCALE WEAK-SUPERVISED IMAGE-TEXT DATA

专知会员服务

43+阅读 · 2020年1月28日

【AAAI2020】多模态注意力语义图嵌入多标签分类（Cross-Modality Attention with Semantic Graph Embedding for Multi-Label Classification）

【AAAI2020】多模态注意力语义图嵌入多标签分类（Cross-Modality Attention with Semantic Graph Embedding for Multi-Label Classification）

专知会员服务

92+阅读 · 2019年12月22日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【ACMMM2025教程】打击网络虚假信息视频：特征分析、检测与防范，170页ppt

海军无人系统：海上作战的演进而非革命

Nature 子刊 | SciToolAgent:知识图谱引导的科学工具智能体

多媒体顶会ACM Multimedia 2025各大奖项揭晓！格拉斯哥大学等获最佳论文，中科院自动化所等获最佳学生论文

相关资讯

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

vae 相关论文表示学习 1

vae 相关论文表示学习 1

CreateAMind

12+阅读 · 2018年9月6日

【论文推荐】最新八篇情感分析相关论文—Pair-wise判别器、多模态情感分析、上下文语境、Gated 卷积网络

【论文推荐】最新八篇情感分析相关论文—Pair-wise判别器、多模态情感分析、上下文语境、Gated 卷积网络

专知

20+阅读 · 2018年6月29日

相关论文

Knowledge Distillation from Multiple Foundation Models for End-to-End Speech Recognition

Arxiv

2+阅读 · 2023年3月20日

Learning to Exploit Temporal Structure for Biomedical Vision-Language Processing

Arxiv

0+阅读 · 2023年3月16日

Multimodality Representation Learning: A Survey on Evolution, Pretraining and Its Applications

Arxiv

20+阅读 · 2023年2月1日

Multimodal Deep Learning

Arxiv

29+阅读 · 2023年1月12日

Attention Bottlenecks for Multimodal Fusion

Arxiv

31+阅读 · 2021年6月30日

Cross-Modal Discrete Representation Learning

Arxiv

18+阅读 · 2021年6月10日

Generative Models as a Data Source for Multiview Representation Learning

Arxiv

16+阅读 · 2021年6月9日

Contrastive Clustering

Arxiv

31+阅读 · 2020年9月21日

Hierarchical Graph Pooling with Structure Learning

Arxiv

13+阅读 · 2019年11月14日

Multimodal Model-Agnostic Meta-Learning via Task-Aware Modulation

Multimodal Model-Agnostic Meta-Learning via Task-Aware Modulation

Arxiv

25+阅读 · 2019年10月30日

相关基金

混合光轴方向卷曲空间电磁特异介质的特性研究

国家自然科学基金

0+阅读 · 2015年12月31日

向量组合学习框架下基于依存混合树的中文语义解析研究

国家自然科学基金

3+阅读 · 2014年12月31日

多孔荧光配位聚合物的构筑及其对客体荧光传感研究

国家自然科学基金

0+阅读 · 2014年12月31日

蓖麻矮化相关RcDof基因功能分析及调控机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

非对称导叶布局及其减振机理研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于语义理解的高分辨率遥感图像变化分析方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

模糊和随机环境下的粒计算与信息融合研究

国家自然科学基金

1+阅读 · 2012年12月31日

新型高速调制半导体激光器研究

国家自然科学基金

0+阅读 · 2012年12月31日

核-壳界面构造与量子点光学性质单分散

国家自然科学基金

0+阅读 · 2011年12月31日

利用超支化聚合物构建高效安全的非病毒基因载体

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员