Model-based reinforcement learning (RL) is a sample-efficient way of learning complex behaviors by leveraging a learned single-step dynamics model to plan actions in imagination. However, planning every single action for long-horizon tasks is impractical, akin to a human planning out every muscle movement. Instead, humans plan efficiently with high-level skills to solve complex tasks. From this intuition, we propose a Skill-based Model-based RL framework (SkiMo) that enables planning in the skill space using a skill dynamics model, which directly predicts skill outcomes rather than predicting every detail of the intermediate states, step by step. For accurate and efficient long-term planning, we jointly learn the skill dynamics model and a skill repertoire from prior experience. We then harness the learned skill dynamics model to accurately simulate and plan over long horizons in the skill space, which enables efficient downstream learning of long-horizon, sparse-reward tasks. Experimental results in navigation and manipulation domains show that SkiMo extends the temporal horizon of model-based approaches and improves the sample efficiency of both model-based RL and skill-based RL. Code and videos are available at https://clvrai.com/skimo.
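To make the core idea concrete, the following is a minimal, hypothetical sketch of skill-space planning with a learned skill dynamics model: a single network call predicts the latent state after an entire skill is executed, and a CEM-style planner searches over skill sequences in imagination. All class and function names, dimensions, and planner details here are illustrative assumptions, not SkiMo's actual implementation.

```python
# Minimal sketch (assumed names/shapes), not SkiMo's actual API.
import torch
import torch.nn as nn


class SkillDynamics(nn.Module):
    """Predicts the state after executing a whole skill (many low-level steps)."""

    def __init__(self, state_dim, skill_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + skill_dim, hidden), nn.ELU(),
            nn.Linear(hidden, hidden), nn.ELU(),
            nn.Linear(hidden, state_dim),
        )

    def forward(self, state, skill):
        # One forward pass jumps over the full skill horizon,
        # instead of predicting each intermediate step.
        return self.net(torch.cat([state, skill], dim=-1))


@torch.no_grad()
def plan_skills(dynamics, reward_fn, state, skill_dim, horizon=5,
                n_samples=256, n_iters=5, n_elite=32):
    """CEM-style planning over a sequence of skills in imagination."""
    mean = torch.zeros(horizon, skill_dim)
    std = torch.ones(horizon, skill_dim)
    for _ in range(n_iters):
        # Sample candidate skill sequences from the current search distribution.
        skills = mean + std * torch.randn(n_samples, horizon, skill_dim)
        returns = torch.zeros(n_samples)
        s = state.unsqueeze(0).expand(n_samples, -1)
        for t in range(horizon):
            s = dynamics(s, skills[:, t])   # imagined skill-level rollout
            returns += reward_fn(s)         # score the predicted skill outcome
        # Refit the distribution to the best-scoring skill sequences.
        elite = skills[returns.topk(n_elite).indices]
        mean, std = elite.mean(0), elite.std(0) + 1e-6
    return mean[0]  # execute the first skill, then re-plan


# Example usage with toy dimensions and a toy reward (purely illustrative).
dyn = SkillDynamics(state_dim=32, skill_dim=10)
reward = lambda s: -s.pow(2).sum(-1)  # drive the predicted state toward zero
first_skill = plan_skills(dyn, reward, torch.zeros(32), skill_dim=10)
```

Because each model call covers a full skill rather than a single environment step, a short planning horizon in skill space corresponds to a long horizon in the original action space, which is what makes long-term planning tractable.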