PI-QT-Opt: Predictive Information Improves Multi-Task Robotic Reinforcement Learning at Scale - 专知论文 (Zhuanzhi Papers)

October 15, 2022

PI-QT-Opt: Predictive Information Improves Multi-Task Robotic Reinforcement Learning at Scale


Kuang-Huei Lee,Ted Xiao,Adrian Li,Paul Wohlhart,Ian Fischer,Yao Lu

From arXiv. CoRL 2022. 21 pages, 9 figures. The supplementary video is available at https://kuanghuei.github.io/piqtopt

The predictive information, the mutual information between the past and future, has been shown to be a useful representation learning auxiliary loss for training reinforcement learning agents, as the ability to model what will happen next is critical to success on many control tasks. While existing studies are largely restricted to training specialist agents on single-task settings in simulation, in this work, we study modeling the predictive information for robotic agents and its importance for general-purpose agents that are trained to master a large repertoire of diverse skills from large amounts of data. Specifically, we introduce Predictive Information QT-Opt (PI-QT-Opt), a QT-Opt agent augmented with an auxiliary loss that learns representations of the predictive information to solve up to 297 vision-based robot manipulation tasks in simulation and the real world with a single set of parameters. We demonstrate that modeling the predictive information significantly improves success rates on the training tasks and leads to better zero-shot transfer to unseen novel tasks. Finally, we evaluate PI-QT-Opt on real robots, achieving substantial and consistent improvement over QT-Opt in multiple experimental settings of varying environments, skills, and multi-task configurations.
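As a rough illustration of how a predictive-information term can act as an auxiliary loss, the sketch below adds a contrastive (InfoNCE) loss between embeddings of past and future observations to a Q-learning loss. This is not the paper's objective: InfoNCE is swapped in here only because it is a widely used lower bound on mutual information, and the names `info_nce_loss`, `total_loss`, `temperature`, and `aux_weight` are hypothetical.

```python
import numpy as np


def info_nce_loss(past_emb, future_emb, temperature=0.1):
    """InfoNCE loss for paired past/future embeddings of shape (batch, dim).

    Row i of `past_emb` and row i of `future_emb` are assumed to come from
    the same trajectory; all other rows in the batch serve as negatives.
    For batch size N, log(N) minus this loss is a lower bound on the mutual
    information between the two embeddings.
    """
    # Normalise so the dot product is a cosine similarity.
    past = past_emb / np.linalg.norm(past_emb, axis=1, keepdims=True)
    future = future_emb / np.linalg.norm(future_emb, axis=1, keepdims=True)

    # Pairwise similarity scores, shape (batch, batch).
    logits = past @ future.T / temperature

    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The diagonal holds the log-probability of the matching future.
    return -np.mean(np.diag(log_probs))


def total_loss(q_loss, past_emb, future_emb, aux_weight=1.0):
    """Q-learning loss plus a weighted predictive-information auxiliary term."""
    return q_loss + aux_weight * info_nce_loss(past_emb, future_emb)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    past = rng.normal(size=(8, 4))
    matched = info_nce_loss(past, past)
    mismatched = info_nce_loss(past, np.roll(past, 1, axis=0))
    print(f"matched pairs: {matched:.3f}, mismatched pairs: {mismatched:.3f}")
```

In this toy setup the loss is lower when each past embedding is paired with its own future than when the pairs are shifted, which is the signal an encoder would be trained on.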

