Recent works have shown that large models pretrained on common visual learning tasks can provide useful representations for a wide range of specialized perception problems, as well as for a variety of robotic manipulation tasks. While prior work on robotic manipulation has predominantly used frozen pretrained features, we demonstrate that in robotics this approach can fail to reach optimal performance, and that fine-tuning the full model can lead to significantly better results. Unfortunately, fine-tuning disrupts the pretrained visual representation and causes representational drift towards the fine-tuned task, leading to a loss of the versatility of the original model. We introduce "lossless adaptation" to address this shortcoming of classical fine-tuning. We demonstrate that appropriate placement of our parameter-efficient adapters can significantly reduce the performance gap between frozen pretrained representations and full end-to-end fine-tuning without changing the original representation, thus preserving the original capabilities of the pretrained model. We perform a comprehensive investigation across three major model architectures (ViTs, NFNets, and ResNets), using supervised (ImageNet-1K classification) and self-supervised (CLIP, BYOL, Visual MAE) pretrained weights, across three task domains and 35 individual tasks, and demonstrate that our claims are strongly validated in a variety of settings.
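To make the idea concrete, the sketch below illustrates one common way such parameter-efficient adaptation can be realized: a small residual bottleneck adapter attached to a frozen pretrained block. This is a minimal illustration, not the paper's implementation; the names `BottleneckAdapter` and `AdaptedBlock`, the bottleneck width, and the zero-initialization scheme are all assumptions for exposition. The key property it demonstrates is the "lossless" one: the backbone weights are never updated, so removing the adapter recovers the original pretrained model exactly.

```python
# Illustrative sketch (PyTorch). BottleneckAdapter and AdaptedBlock are
# hypothetical names, not the authors' actual architecture.
import torch.nn as nn


class BottleneckAdapter(nn.Module):
    """Small residual bottleneck MLP; the only trainable parameters."""

    def __init__(self, dim: int, bottleneck: int = 32):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck, dim)
        # Zero-init the up-projection so the adapter starts as an identity
        # residual: at initialization the adapted model reproduces the
        # frozen pretrained features exactly.
        nn.init.zeros_(self.up.weight)
        nn.init.zeros_(self.up.bias)

    def forward(self, x):
        return x + self.up(self.act(self.down(x)))


class AdaptedBlock(nn.Module):
    """Wraps a frozen pretrained block with a trainable adapter."""

    def __init__(self, block: nn.Module, dim: int):
        super().__init__()
        self.block = block
        for p in self.block.parameters():
            p.requires_grad = False  # backbone stays frozen and unchanged
        self.adapter = BottleneckAdapter(dim)

    def forward(self, x):
        return self.adapter(self.block(x))
```

Because only the adapter parameters receive gradients, the pretrained weights are untouched during task training; discarding the adapters restores the original model and its representations without any loss.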