目标分工任务以基于计划放松的奖励 (Plan-Based Relaxed Reward Shaping for Goal-Directed Tasks) - 专知论文

会员服务 ·

0

塑造 · Better · Less · 样本 · 优化器 ·

2021 年 7 月 14 日

Plan-Based Relaxed Reward Shaping for Goal-Directed Tasks

翻译：目标分工任务以基于计划放松的奖励

Ingmar Schubert,Ozgur S. Oguz,Marc Toussaint

from arxiv, Published as a conference paper at ICLR 2021

In high-dimensional state spaces, the usefulness of Reinforcement Learning (RL) is limited by the problem of exploration. This issue has been addressed using potential-based reward shaping (PB-RS) previously. In the present work, we introduce Final-Volume-Preserving Reward Shaping (FV-RS). FV-RS relaxes the strict optimality guarantees of PB-RS to a guarantee of preserved long-term behavior. Being less restrictive, FV-RS allows for reward shaping functions that are even better suited for improving the sample efficiency of RL algorithms. In particular, we consider settings in which the agent has access to an approximate plan. Here, we use examples of simulated robotic manipulation tasks to demonstrate that plan-based FV-RS can indeed significantly improve the sample efficiency of RL over plan-based PB-RS.

翻译：在高地国家空间,强化学习的效用受到勘探问题的限制,这个问题已经利用以前基于潜在奖励的形状(PB-RS)来解决,在目前的工作中,我们采用了最终-Volume-Preserve Reward 形状(FV-RS),FV-RS放松了PB-RS的严格最佳保证,以保障保存的长期行为。FV-RS限制较少,允许奖励甚至更适合提高RL算法抽样效率的塑造功能。我们特别考虑到该代理商可以获得近似计划的环境。在这里,我们使用模拟机器人操纵任务的例子来证明,基于计划的FV-RS确实能够大大提高RL超过基于计划的PB-RS的样本效率。

0

相关内容

不可错过！UIUC最新《统计强化学习》课程！

专知会员服务

53+阅读 · 2020年9月7日

【伯克利】机器学习蛋白质工程，Machine learning for protein engineering，83页ppt

【伯克利】机器学习蛋白质工程，Machine learning for protein engineering，83页ppt

专知会员服务

36+阅读 · 2020年5月9日

【论文推荐】自然语言处理与查询扩展综述，Natural Language Processing and Query Expansion

【论文推荐】自然语言处理与查询扩展综述，Natural Language Processing and Query Expansion

专知会员服务

44+阅读 · 2020年5月3日

因果关联学习，Causal Relational Learning

因果关联学习，Causal Relational Learning

专知会员服务

185+阅读 · 2020年4月21日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

【2020新书】图机器学习，Graph-Powered Machine Learning

【2020新书】图机器学习，Graph-Powered Machine Learning

专知会员服务

343+阅读 · 2020年1月27日

【斯坦福大学CS229】面向机器学习的线性代数和微积分要点速览(中文版)《CS 229 - Linear Algebra and Calculus refresher》by Afshine Amidi, Shervine Amidi

【斯坦福大学CS229】面向机器学习的线性代数和微积分要点速览(中文版)《CS 229 - Linear Algebra and Calculus refresher》by Afshine Amidi, Shervine Amidi

专知会员服务

197+阅读 · 2019年12月19日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

已删除

清华大学研究生教育

3+阅读 · 2018年6月30日

Termination Analysis Without the Tears

Arxiv

0+阅读 · 2021年9月15日

SCAPE: Learning Stiffness Control from Augmented Position Control Experiences

Arxiv

0+阅读 · 2021年9月14日

safe-control-gym: a Unified Benchmark Suite for Safe Learning-based Control and Reinforcement Learning

Arxiv

0+阅读 · 2021年9月13日

Particle MPC for Uncertain and Learning-Based Control

Arxiv

0+阅读 · 2021年9月13日

Federated Ensemble Model-based Reinforcement Learning

Arxiv

0+阅读 · 2021年9月12日

Cost-Effective Federated Learning in Mobile Edge Networks

Arxiv

0+阅读 · 2021年9月12日

Information-Directed Exploration for Deep Reinforcement Learning

Information-Directed Exploration for Deep Reinforcement Learning

Arxiv

5+阅读 · 2018年12月18日

CIRL: Controllable Imitative Reinforcement Learning for Vision-based Self-driving

CIRL: Controllable Imitative Reinforcement Learning for Vision-based Self-driving

Arxiv

8+阅读 · 2018年7月10日

Learning to Adapt: Meta-Learning for Model-Based Control

Arxiv

9+阅读 · 2018年3月30日

Learning to Speed Up Query Planning in Graph Databases

Arxiv

6+阅读 · 2018年1月21日

VIP会员

文章信息

相关主题

相关VIP内容

不可错过！UIUC最新《统计强化学习》课程！

专知会员服务

53+阅读 · 2020年9月7日

【伯克利】机器学习蛋白质工程，Machine learning for protein engineering，83页ppt

【伯克利】机器学习蛋白质工程，Machine learning for protein engineering，83页ppt

专知会员服务

36+阅读 · 2020年5月9日

【论文推荐】自然语言处理与查询扩展综述，Natural Language Processing and Query Expansion

【论文推荐】自然语言处理与查询扩展综述，Natural Language Processing and Query Expansion

专知会员服务

44+阅读 · 2020年5月3日

因果关联学习，Causal Relational Learning

因果关联学习，Causal Relational Learning

专知会员服务

185+阅读 · 2020年4月21日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

【2020新书】图机器学习，Graph-Powered Machine Learning

【2020新书】图机器学习，Graph-Powered Machine Learning

专知会员服务

343+阅读 · 2020年1月27日

【斯坦福大学CS229】面向机器学习的线性代数和微积分要点速览(中文版)《CS 229 - Linear Algebra and Calculus refresher》by Afshine Amidi, Shervine Amidi

【斯坦福大学CS229】面向机器学习的线性代数和微积分要点速览(中文版)《CS 229 - Linear Algebra and Calculus refresher》by Afshine Amidi, Shervine Amidi

专知会员服务

197+阅读 · 2019年12月19日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《从装备到文化：美陆军技术素养建设启示录》最新报告

人工智能安全治理白皮书（2025）

《美国太空军系统全生命周期建模、仿真与分析效能提升方案》最新84页报告

《商用大语言模型的升级风险管理：国家安全运用》

相关资讯

已删除

清华大学研究生教育

3+阅读 · 2018年6月30日

相关论文

Termination Analysis Without the Tears

Arxiv

0+阅读 · 2021年9月15日

SCAPE: Learning Stiffness Control from Augmented Position Control Experiences

Arxiv

0+阅读 · 2021年9月14日

safe-control-gym: a Unified Benchmark Suite for Safe Learning-based Control and Reinforcement Learning

Arxiv

0+阅读 · 2021年9月13日

Particle MPC for Uncertain and Learning-Based Control

Arxiv

0+阅读 · 2021年9月13日

Federated Ensemble Model-based Reinforcement Learning

Arxiv

0+阅读 · 2021年9月12日

Cost-Effective Federated Learning in Mobile Edge Networks

Arxiv

0+阅读 · 2021年9月12日

Information-Directed Exploration for Deep Reinforcement Learning

Information-Directed Exploration for Deep Reinforcement Learning

Arxiv

5+阅读 · 2018年12月18日

CIRL: Controllable Imitative Reinforcement Learning for Vision-based Self-driving

CIRL: Controllable Imitative Reinforcement Learning for Vision-based Self-driving

Arxiv

8+阅读 · 2018年7月10日

Learning to Adapt: Meta-Learning for Model-Based Control

Arxiv

9+阅读 · 2018年3月30日

Learning to Speed Up Query Planning in Graph Databases

Arxiv

6+阅读 · 2018年1月21日

微信扫码咨询专知VIP会员