We study the problem of generating control laws for systems with unknown dynamics. Our approach is to represent both the controller and the value function with neural networks, and to train them using loss functions adapted from the Hamilton-Jacobi-Bellman (HJB) equations. In the absence of a known dynamics model, our method first learns the state transitions from data collected by interacting with the system in an offline process. The learned transition function is then integrated into the HJB equations and used to forward-simulate the control signals produced by our controller in a feedback loop. In contrast to trajectory optimization methods, which optimize the controller for a single initial state, our controller can generate near-optimal control signals for initial states drawn from a large portion of the state space. Compared to recent model-based reinforcement learning algorithms, we show that our method is more sample-efficient and trains an order of magnitude faster. We demonstrate our method on a number of tasks, including the control of a quadrotor with 12 state variables.
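For concreteness, here is a minimal sketch of how HJB-based training losses of this kind are commonly formed; the symbols below (running cost $c$, discount rate $\rho$, policy network $\pi_\theta$, value network $V_\phi$, learned transition model $\hat{f}$, and step size $\Delta t$) are illustrative assumptions, not notation from the paper, whose exact formulation may differ. The transition model is first fit to the offline interaction data, e.g. under a forward-Euler discretization,
\[
\hat{f} \;=\; \arg\min_{f}\; \mathbb{E}_{(x,\,u,\,x')}\big[\, \| x' - x - f(x, u)\,\Delta t \|^2 \,\big],
\]
and the value network is then trained to drive the continuous-time HJB residual to zero under the learned dynamics,
\[
\mathcal{L}_{\mathrm{HJB}}(\theta, \phi) \;=\; \mathbb{E}_{x}\Big[\big(\, \rho\, V_\phi(x) \;-\; c\big(x, \pi_\theta(x)\big) \;-\; \nabla_x V_\phi(x)^{\top} \hat{f}\big(x, \pi_\theta(x)\big) \,\big)^2\Big],
\]
while the controller $\pi_\theta$ is updated to minimize the Hamiltonian $c(x, u) + \nabla_x V_\phi(x)^{\top}\hat{f}(x, u)$ at states visited by forward-simulated rollouts. Because the expectation ranges over states sampled from a large region rather than a single trajectory, one trained network can serve many initial states, consistent with the claim above.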