Deep reinforcement learning (deep RL) has emerged as an effective tool for developing controllers for legged robots. However, simple neural network representations are known for their poor extrapolation ability, which makes the learned behaviors vulnerable to unseen perturbations or challenging terrains. To address this, researchers have investigated a novel architecture, Policies Modulating Trajectory Generators (PMTG), which combines trajectory generators (TGs) with feedback control signals to achieve more robust behaviors. In this work, we propose a finite state machine PMTG, which extends the PMTG framework by replacing its simple TGs with asynchronous finite state machines (Async FSMs). This design gives the policy an explicit notion of contact events, allowing it to negotiate unexpected perturbations. We demonstrate that the proposed architecture achieves more robust behaviors in various scenarios, such as challenging terrains or external perturbations, on both simulated and real robots. The supplemental video can be found at: http://youtu.be/XUiTSZaM8f0.
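To make the described architecture concrete, below is a minimal sketch of a PMTG-style control loop in which each leg's trajectory generator is a two-state asynchronous FSM whose swing-to-stance transition is triggered by a contact event. All names (`AsyncLegFSM`, `pmtg_step`, the frequency-modulation interface) are illustrative assumptions for exposition, not the authors' implementation.

```python
import numpy as np


class AsyncLegFSM:
    """Two-state (swing/stance) trajectory generator for one leg.

    Unlike a clock-driven TG, transitions are event-based: the FSM leaves
    stance on a policy-modulated timer and leaves swing when foot contact
    is detected, so each leg can react to perturbations independently
    (hence "asynchronous").
    """

    def __init__(self):
        self.state = "stance"
        self.phase = 0.0  # progress within the current state, in [0, 1]

    def step(self, dt, frequency, in_contact):
        # The policy modulates the nominal stepping frequency.
        self.phase += dt * frequency
        if self.state == "stance" and self.phase >= 1.0:
            self.state, self.phase = "swing", 0.0
        elif self.state == "swing" and (in_contact or self.phase >= 1.0):
            # A contact event ends swing early -- the explicit notion of
            # contact the abstract refers to.
            self.state, self.phase = "stance", 0.0
        return self.nominal_foot_target()

    def nominal_foot_target(self):
        # Placeholder open-loop foot trajectory; a real TG would output a
        # swing arc / stance line in the leg frame.
        z = 0.05 * np.sin(np.pi * self.phase) if self.state == "swing" else 0.0
        return np.array([0.0, 0.0, z])


def pmtg_step(policy, obs, fsms, dt):
    """One PMTG control step: action = TG output + policy residual, while
    the policy simultaneously modulates the TGs (here, their frequencies)
    and observes their internal phase."""
    fsm_state = np.array([fsm.phase for fsm in fsms])
    tg_params, residual = policy(np.concatenate([obs["proprio"], fsm_state]))
    targets = [
        fsm.step(dt, frequency=tg_params[i], in_contact=obs["contacts"][i])
        for i, fsm in enumerate(fsms)
    ]
    return np.concatenate(targets) + residual


if __name__ == "__main__":
    # Toy usage: a fixed-frequency "policy" with zero residual actions.
    fsms = [AsyncLegFSM() for _ in range(4)]
    policy = lambda x: (np.full(4, 2.0), np.zeros(12))  # 2 Hz per leg
    obs = {"proprio": np.zeros(8), "contacts": [False] * 4}
    for _ in range(100):
        action = pmtg_step(policy, obs, fsms, dt=0.01)
```

In this sketch, a blocked swing foot that touches down early immediately flips its FSM back to stance, and that event is visible to the policy through the FSM phase it observes; a purely clock-driven TG would instead continue its open-loop cycle regardless of contact.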