Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos - 专知论文 (Zhuanzhi Papers)


June 23, 2022

Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos


Bowen Baker, Ilge Akkaya, Peter Zhokhov, Joost Huizinga, Jie Tang, Adrien Ecoffet, Brandon Houghton, Raul Sampedro, Jeff Clune

Pretraining on noisy, internet-scale datasets has been heavily studied as a technique for training models with broad, general capabilities for text, images, and other modalities. However, for many sequential decision domains such as robotics, video games, and computer use, publicly available data does not contain the labels required to train behavioral priors in the same way. We extend the internet-scale pretraining paradigm to sequential decision domains through semi-supervised imitation learning wherein agents learn to act by watching online unlabeled videos. Specifically, we show that with a small amount of labeled data we can train an inverse dynamics model accurate enough to label a huge unlabeled source of online data -- here, online videos of people playing Minecraft -- from which we can then train a general behavioral prior. Despite using the native human interface (mouse and keyboard at 20Hz), we show that this behavioral prior has nontrivial zero-shot capabilities and that it can be fine-tuned, with both imitation learning and reinforcement learning, to hard-exploration tasks that are impossible to learn from scratch via reinforcement learning. For many tasks our models exhibit human-level performance, and we are the first to report computer agents that can craft diamond tools, which can take proficient humans upwards of 20 minutes (24,000 environment actions) of gameplay to accomplish.
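The three-stage recipe in the abstract (fit an inverse dynamics model on a little labeled data, use it to label unlabeled videos, then train a behavioral prior on those pseudo-labels) can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration: the one-dimensional "world", the action names, the count-based lookup tables standing in for learned models, and the function names `train_idm`, `pseudo_label`, and `behavior_clone`. It is not the authors' implementation and says nothing about their architecture or training setup.

```python
from collections import Counter, defaultdict

# Toy world (hypothetical): an observation is an integer position,
# and an action moves the agent left, right, or not at all.

def train_idm(labeled):
    """Fit a toy inverse dynamics model from (obs, next_obs, action) triples.

    The "model" is a lookup table from observed change to the most
    frequently labeled action for that change.
    """
    votes = defaultdict(Counter)
    for obs, next_obs, action in labeled:
        votes[next_obs - obs][action] += 1
    return {delta: c.most_common(1)[0][0] for delta, c in votes.items()}

def pseudo_label(idm, video):
    """Infer an action for each consecutive observation pair in an unlabeled video."""
    return [(o, idm[n - o]) for o, n in zip(video, video[1:]) if (n - o) in idm]

def behavior_clone(pairs):
    """Fit a toy behavioral prior: the most common pseudo-labeled action per observation."""
    votes = defaultdict(Counter)
    for obs, action in pairs:
        votes[obs][action] += 1
    return {obs: c.most_common(1)[0][0] for obs, c in votes.items()}

# Stage 1: a small labeled dataset trains the inverse dynamics model.
labeled = [(0, 1, "right"), (3, 2, "left"), (5, 5, "stay"), (2, 3, "right")]
idm = train_idm(labeled)

# Stage 2: a larger pool of observation-only "videos" is labeled by the model.
videos = [[0, 1, 2, 3, 4], [1, 2, 3, 4, 4], [4, 4]]
dataset = [pair for video in videos for pair in pseudo_label(idm, video)]

# Stage 3: behavioral cloning on the pseudo-labeled data yields the prior.
policy = behavior_clone(dataset)

# Arithmetic check of the abstract's figure: 20 minutes at 20 Hz.
actions_in_20_minutes = 20 * 60 * 20  # minutes * seconds per minute * actions per second
```

In this toy, the policy learns to move right until it reaches position 4 and then stay, even though no video carried action labels. The last line confirms that 20 minutes of play at 20 Hz corresponds to the 24,000 environment actions quoted in the abstract.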




