Masked Autoencoding for Scalable and Generalizable Decision Making - 专知论文

November 23, 2022

Masked Autoencoding for Scalable and Generalizable Decision Making

Fangchen Liu, Hao Liu, Aditya Grover, Pieter Abbeel

We are interested in learning scalable agents for reinforcement learning that can learn from large-scale, diverse sequential data, similar to current large vision and language models. To this end, this paper presents masked decision prediction (MaskDP), a simple and scalable self-supervised pretraining method for reinforcement learning (RL) and behavioral cloning (BC). In our MaskDP approach, we apply a masked autoencoder (MAE) to state-action trajectories, wherein we randomly mask state and action tokens and reconstruct the missing data. By doing so, the model is required to infer masked-out states and actions and extract information about dynamics. We find that masking different proportions of the input sequence significantly helps with learning a better model that generalizes well to multiple downstream tasks. In our empirical study, we find that a MaskDP model gains the capability of zero-shot transfer to new BC tasks, such as single- and multiple-goal reaching, and it can zero-shot infer skills from a few example transitions. In addition, MaskDP transfers well to offline RL and shows promising scaling behavior with respect to model size. It is amenable to data-efficient finetuning, achieving competitive results with prior methods based on autoregressive pretraining.
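The masking step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`interleave`, `random_mask`, `sample_mask_ratio`), the use of `None` as a stand-in for a learned mask token, and the specific set of mask ratios are all assumptions made for illustration. The sketch only shows how a state-action trajectory might be flattened into tokens, how a random subset could be hidden, and how the hidden values become reconstruction targets.

```python
import random

# Assumption: None stands in for a learned mask-token embedding.
MASK = None


def interleave(states, actions):
    """Flatten a trajectory into tokens [s0, a0, s1, a1, ...]."""
    tokens = []
    for s, a in zip(states, actions):
        tokens.append(("state", s))
        tokens.append(("action", a))
    return tokens


def random_mask(tokens, mask_ratio, rng):
    """Hide a random subset of state and action tokens.

    Returns the masked input sequence and a dict mapping each masked
    position to the original value the model would have to reconstruct.
    """
    n_mask = round(mask_ratio * len(tokens))
    masked_idx = set(rng.sample(range(len(tokens)), n_mask))
    inputs = [
        (kind, MASK) if i in masked_idx else (kind, value)
        for i, (kind, value) in enumerate(tokens)
    ]
    targets = {i: tokens[i][1] for i in masked_idx}
    return inputs, targets


def sample_mask_ratio(rng, ratios=(0.15, 0.35, 0.55, 0.75, 0.95)):
    """Pick a mask ratio per trajectory (the values here are illustrative)."""
    return rng.choice(ratios)


if __name__ == "__main__":
    rng = random.Random(0)
    states = [(0.0, 0.1), (0.2, 0.3), (0.4, 0.5), (0.6, 0.7)]
    actions = [(1.0,), (-1.0,), (0.5,), (0.0,)]
    tokens = interleave(states, actions)
    inputs, targets = random_mask(tokens, sample_mask_ratio(rng), rng)
    print(inputs)
    print(targets)
```

Sampling a different ratio for each trajectory mirrors the abstract's observation that masking different proportions of the input sequence helps; a real model would feed `inputs` to an encoder-decoder and compute a reconstruction loss against `targets`.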
