Adaptive Risk Sensitive Model Predictive Control with Stochastic Search - Zhuanzhi (专知) Papers

Optimizers · Controllers · MoDELS · Dynamical Systems · Reinforcement Learning

February 11, 2021

Adaptive Risk Sensitive Model Predictive Control with Stochastic Search

Ziyi Wang, Oswin So, Keuntaek Lee, Camilo A. Duarte, Evangelos A. Theodorou

We present a general framework for optimizing the Conditional Value-at-Risk (CVaR) for dynamical systems using stochastic search. The framework handles uncertainty arising from the initial condition, stochastic dynamics, and uncertain model parameters. The algorithm is compared against a risk-sensitive distributional reinforcement learning framework and outperforms it on a pendulum and a cartpole with stochastic dynamics. We also showcase the applicability of the framework to robotics as an adaptive risk-sensitive controller by optimizing with respect to the fully nonlinear belief provided by a particle filter on a pendulum, cartpole, and quadcopter in simulation.
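To make the idea of optimizing CVaR by stochastic search concrete, here is a minimal Python sketch. It is not the authors' algorithm: it uses a generic elite-averaging (cross-entropy-style) search over open-loop control sequences, and the scalar system, cost weights, function names, and hyperparameters are all assumptions chosen for illustration. CVaR at level alpha is estimated as the mean of the worst (1 - alpha) fraction of sampled costs.

```python
import numpy as np


def empirical_cvar(costs, alpha=0.9):
    """Mean of the worst (1 - alpha) fraction of sampled costs (upper tail)."""
    costs = np.sort(np.asarray(costs, dtype=float))
    k = max(1, int(np.ceil((1.0 - alpha) * costs.size)))
    return costs[-k:].mean()


def rollout_costs(u, x0, w):
    """Cost of control sequence u (H,) in each of M scenarios.

    Hypothetical toy system: x_{t+1} = x_t + u_t + w_t, with uncertain
    initial states x0 (M,) and process noise w (M, H).
    """
    x = x0.copy()
    cost = np.zeros_like(x0)
    for t in range(u.size):
        x = x + u[t] + w[:, t]
        cost += x**2 + 0.1 * u[t] ** 2
    return cost


def cvar_stochastic_search(x0, w, rng, n_candidates=64, n_elite=8,
                           iters=30, alpha=0.9):
    """Search for a control sequence with low empirical CVaR of the cost."""
    horizon = w.shape[1]
    mean, std = np.zeros(horizon), np.ones(horizon)
    for _ in range(iters):
        # Sample candidate control sequences from the current Gaussian.
        cand = mean + std * rng.standard_normal((n_candidates, horizon))
        # Score each candidate by its CVaR over the same set of scenarios.
        scores = np.array(
            [empirical_cvar(rollout_costs(u, x0, w), alpha) for u in cand]
        )
        # Refit the sampling distribution to the lowest-CVaR candidates.
        elite = cand[np.argsort(scores)[:n_elite]]
        mean, std = elite.mean(axis=0), elite.std(axis=0) + 1e-3
    return mean, empirical_cvar(rollout_costs(mean, x0, w), alpha)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x0 = 1.0 + 0.1 * rng.standard_normal(200)   # uncertain initial condition
    w = 0.05 * rng.standard_normal((200, 5))    # stochastic dynamics
    u_best, cvar_best = cvar_stochastic_search(x0, w, rng)
    print("controls:", np.round(u_best, 3), "CVaR:", round(cvar_best, 4))
```

In a receding-horizon (MPC) setting, only the first control of the returned sequence would be applied before re-planning. The scenario set could in principle be drawn from a particle filter's particles rather than from a fixed Gaussian, which is one way to read the adaptive use case described above.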
