Black Box Few-Shot Adaptation for Vision-Language models - 专知论文 (Zhuanzhi Papers)


Prompt learning · Model access · Black box · Vision-language models

April 4, 2023

Black Box Few-Shot Adaptation for Vision-Language models


Yassine Ouali, Adrian Bulat, Brais Martinez, Georgios Tzimiropoulos

Vision-Language (V-L) models trained with contrastive learning to align the visual and language modalities have been shown to be strong few-shot learners. Soft prompt learning is the method of choice for few-shot downstream adaptation, aiming to bridge the modality gap caused by the distribution shift induced by the new domain. While parameter-efficient, prompt learning still requires access to the model weights and can be computationally infeasible for large models with billions of parameters. To address these shortcomings, in this work, we describe a black-box method for V-L few-shot adaptation that (a) operates on pre-computed image and text features and hence works without access to the model's weights, (b) is orders of magnitude faster at training time, (c) is amenable to both supervised and unsupervised training, and (d) can even be used to align image and text features computed from uni-modal models. To achieve this, we propose Linear Feature Alignment (LFA), a simple linear approach for V-L re-alignment in the target domain. LFA is initialized from a closed-form solution to a least-squares problem and is then iteratively updated by minimizing a re-ranking loss. Despite its simplicity, our approach can even surpass soft-prompt learning methods, as shown by extensive experiments on 11 image and 2 video datasets.
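The closed-form initialization mentioned in the abstract can be illustrated with a minimal sketch. It assumes a plain, unconstrained least-squares fit of a linear map `W` from image features to the text feature of each image's class; the abstract does not state the exact objective or any constraints, and the iterative re-ranking refinement is omitted. The function names `fit_linear_alignment` and `classify`, and the synthetic features, are hypothetical and not taken from the paper.

```python
import numpy as np

def fit_linear_alignment(image_feats, text_feats, labels):
    """Return W minimising ||image_feats @ W - text_feats[labels]||_F^2."""
    targets = text_feats[labels]  # (N, d): text feature of each image's class
    W, *_ = np.linalg.lstsq(image_feats, targets, rcond=None)
    return W

def classify(image_feats, text_feats, W):
    """Assign each mapped image feature to the class with highest cosine similarity."""
    mapped = image_feats @ W
    mapped = mapped / np.linalg.norm(mapped, axis=1, keepdims=True)
    text = text_feats / np.linalg.norm(text_feats, axis=1, keepdims=True)
    return (mapped @ text.T).argmax(axis=1)

# Synthetic stand-in for pre-computed features (assumption: no real model involved).
rng = np.random.default_rng(0)
num_classes, shots, dim = 4, 16, 16
text_feats = rng.normal(size=(num_classes, dim))
labels = np.repeat(np.arange(num_classes), shots)
rotation, _ = np.linalg.qr(rng.normal(size=(dim, dim)))  # simulated misalignment
image_feats = text_feats[labels] @ rotation + 0.01 * rng.normal(size=(labels.size, dim))

# Fit on the few-shot set and measure accuracy on that same set.
W = fit_linear_alignment(image_feats, text_feats, labels)
accuracy = (classify(image_feats, text_feats, W) == labels).mean()
print(f"few-shot set accuracy after alignment: {accuracy:.2f}")
```

Because the fit only touches feature matrices, nothing in this sketch needs the encoder weights, which is the sense in which such an approach can be called black-box.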
