Current reinforcement learning (RL) often suffers on challenging exploration problems where the desired outcomes or high rewards are rarely observed. Although curriculum RL, a framework that solves complex tasks by proposing a sequence of surrogate tasks, shows reasonable results, most previous works still have difficulty proposing a curriculum due to the absence of a mechanism for obtaining calibrated guidance toward the desired outcome states without any prior domain knowledge. To alleviate this, we propose an uncertainty- and temporal-distance-aware curriculum goal generation method for outcome-directed RL that works by solving a bipartite matching problem. It not only provides precisely calibrated guidance of the curriculum toward the desired outcome states, but also achieves much better sample efficiency and a geometry-agnostic curriculum goal proposal capability compared to previous curriculum RL methods. We demonstrate, both quantitatively and qualitatively, that our algorithm significantly outperforms these prior methods on a variety of challenging navigation and robotic manipulation tasks.
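To make the core idea concrete, here is a minimal sketch (not the authors' implementation) of curriculum goal proposal via bipartite matching: candidate goals drawn from a replay buffer are assigned to desired outcome states by minimizing a cost that trades off a temporal-distance estimate against an epistemic-uncertainty bonus. The helpers `temporal_dist_fn` and `uncertainty_fn`, the weight `w`, and the exact cost form are all assumptions for illustration; the paper's actual formulation may differ.

```python
# Minimal sketch of bipartite-matching-based curriculum goal proposal.
# Assumptions (not from the paper): `temporal_dist_fn` and `uncertainty_fn`
# are hypothetical stand-ins for learned temporal-distance and epistemic-
# uncertainty estimates, and the cost form below is illustrative only.
import numpy as np
from scipy.optimize import linear_sum_assignment


def propose_curriculum_goals(candidate_goals, desired_outcomes,
                             temporal_dist_fn, uncertainty_fn, w=1.0):
    """Match one curriculum goal to each desired outcome state.

    candidate_goals:  (N, d) achieved states sampled from the replay buffer.
    desired_outcomes: (M, d) desired outcome states, with M <= N.
    Returns an (M, d) array of curriculum goals, aligned with the outcomes.
    """
    # Cost of assigning candidate i to outcome j: prefer candidates that are
    # temporally close to the outcome, and reward high epistemic uncertainty
    # (i.e., frontier states worth exploring) by subtracting it.
    temporal = temporal_dist_fn(candidate_goals, desired_outcomes)  # (N, M)
    bonus = uncertainty_fn(candidate_goals)                         # (N,)
    cost = temporal - w * bonus[:, None]                            # (N, M)

    # The Hungarian algorithm solves the minimum-cost bipartite matching;
    # with N > M, only the M best candidates are assigned.
    rows, cols = linear_sum_assignment(cost)
    order = np.argsort(cols)  # re-align matches with outcome ordering
    return candidate_goals[rows[order]]
```

In a full training loop, the matched goals would serve as intermediate targets for a goal-conditioned policy, with the curriculum naturally shifting toward the desired outcomes as the uncertainty bonus at the explored frontier shrinks.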