Many robotic tasks consist of numerous temporally correlated sub-tasks in highly complex environments. To solve such problems effectively, it is important to discover situational intentions and appropriate actions by deliberating over temporal abstractions. To understand intention separately from changing task dynamics, we extend an empowerment-based regularization technique to multi-task settings within the framework of a generative adversarial network. In multi-task environments with unknown dynamics, we focus on learning a reward function and policy from unlabeled expert examples. In this study, we define situational empowerment as the maximum of the mutual information representing how an action, conditioned on both a state and a sub-task, affects the future. Our proposed method derives a variational lower bound of this situational mutual information and optimizes it. By adding the induced term to the objective function, we simultaneously learn a transferable multi-task reward function and policy. In doing so, the multi-task reward function helps to learn a policy that is robust to environmental change. We validate the advantages of our approach on multi-task learning and multi-task transfer learning, and demonstrate that our proposed method is robust to both randomness and changing task dynamics. Finally, we show that our method achieves significantly better performance and data efficiency than existing imitation learning methods on various benchmarks.
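For concreteness, a minimal sketch of the central quantity, in illustrative notation of our own (state $s$, action $a$, next state $s'$, sub-task $c$, exploration distribution $\omega$, variational inverse model $q_\phi$; these symbols are assumptions, not necessarily the paper's). Following standard empowerment formulations, situational empowerment is the maximal mutual information between an action and the resulting state, conditioned on both the current state and the sub-task:

% Illustrative sketch of the standard variational empowerment bound,
% conditioned additionally on the sub-task c; notation is assumed.
\begin{align}
  \Phi(s, c)
    &= \max_{\omega}\, I(s'; a \mid s, c)
     = \max_{\omega}\,
       \mathbb{E}_{\omega(a \mid s,\, c)\, p(s' \mid s,\, a)}
       \!\left[ \log \frac{p(a \mid s, s', c)}{\omega(a \mid s, c)} \right] \\
    % Replacing the intractable inverse model p(a | s, s', c) with a
    % learned variational posterior q_\phi yields a tractable lower
    % bound, by non-negativity of the KL divergence:
    &\ge \max_{\omega}\,
       \mathbb{E}_{\omega(a \mid s,\, c)\, p(s' \mid s,\, a)}
       \!\left[ \log q_\phi(a \mid s, s', c) - \log \omega(a \mid s, c) \right].
\end{align}

Jointly optimizing this lower bound over $q_\phi$ and $\omega$ is, presumably, what makes the induced empowerment term tractable inside the adversarial objective.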