Learning Setup Policies: Reliable Transition Between Locomotion Behaviours

6 October 2022

Brendan Tidd, Nicolas Hudson, Akansel Cosgun, Jurgen Leitner

From arXiv. Published in IEEE Robotics and Automation Letters, Volume 7, Issue 4, October 2022, pages 11958-11965. https://ieeexplore.ieee.org/document/9894663

Dynamic platforms that operate over many unique terrain conditions typically require many behaviours. To transition safely, there must be an overlap of states between adjacent controllers. We develop a novel method for training setup policies that bridge the trajectories between pre-trained Deep Reinforcement Learning (DRL) policies. We demonstrate our method with a simulated biped traversing a difficult jump terrain, where a single policy fails to learn the task, and switching between pre-trained policies without setup policies also fails. We perform an ablation of key components of our system, and show that our method outperforms others that learn transition policies. We demonstrate our method with several difficult and diverse terrain types, and show that we can use setup policies as part of a modular control suite to successfully traverse a sequence of complex terrains. We show that using setup policies improves the success rate for traversing a single difficult jump terrain (from 51.3% success rate with the best comparative method to 82.2%), and traversing a random sequence of difficult obstacles (from 1.9% without setup policies to 71.2%).
