KCRL: Krasovskii-Constrained Reinforcement Learning with Guaranteed Stability in Nonlinear Dynamical Systems - Zhuanzhi Papers

Learning · Dynamical Systems · Lyapunov · Reinforcement Learning · Sample Complexity

June 3, 2022

KCRL: Krasovskii-Constrained Reinforcement Learning with Guaranteed Stability in Nonlinear Dynamical Systems

Sahin Lale, Yuanyuan Shi, Guannan Qu, Kamyar Azizzadenesheli, Adam Wierman, Anima Anandkumar

Learning a dynamical system requires stabilizing the unknown dynamics to avoid state blow-ups. However, current reinforcement learning (RL) methods lack stabilization guarantees, which limits their applicability for the control of safety-critical systems. We propose a model-based RL framework with formal stability guarantees, Krasovskii Constrained RL (KCRL), that adopts Krasovskii's family of Lyapunov functions as a stability constraint. The proposed method learns the system dynamics up to a confidence interval using feature representation, e.g. Random Fourier Features. It then solves a constrained policy optimization problem with a stability constraint based on Krasovskii's method using a primal-dual approach to recover a stabilizing policy. We show that KCRL is guaranteed to learn a stabilizing policy in a finite number of interactions with the underlying unknown system. We also derive the sample complexity upper bound for stabilization of unknown nonlinear dynamical systems via the KCRL framework.
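To make the stability constraint concrete, here is a minimal sketch of the classical continuous-time Krasovskii condition, not the authors' implementation. For a system x' = f(x) with f(0) = 0, Krasovskii's method uses V(x) = f(x)^T P f(x) as a Lyapunov candidate; V decreases along trajectories when J(x)^T P + P J(x) is negative definite, where J is the Jacobian of f. The two-dimensional dynamics, the choice P = I, and all function names below are assumptions chosen for illustration, and the abstract does not say whether the paper's formulation takes this exact form.

```python
import math


def f(x):
    """Hypothetical closed-loop dynamics x' = f(x) with an equilibrium at the origin."""
    x1, x2 = x
    return (-3.0 * x1 + x2, x1 - x2 - x2 ** 3)


def jacobian(x):
    """Analytic Jacobian of the hypothetical f above, evaluated at x."""
    _, x2 = x
    return ((-3.0, 1.0), (1.0, -1.0 - 3.0 * x2 ** 2))


def krasovskii_margin(x):
    """Largest eigenvalue of J(x)^T + J(x), i.e. the Krasovskii matrix with P = I.

    A negative value means the condition holds at x.
    """
    J = jacobian(x)
    # Entries of the symmetric 2x2 matrix S = J^T + J = [[a, b], [b, d]].
    a = 2.0 * J[0][0]
    b = J[0][1] + J[1][0]
    d = 2.0 * J[1][1]
    # Closed-form eigenvalues of a symmetric 2x2 matrix: mean +/- radius.
    mean = 0.5 * (a + d)
    radius = math.sqrt((0.5 * (a - d)) ** 2 + b ** 2)
    return mean + radius


def worst_margin_on_grid(limit=2.0, steps=21):
    """Maximum margin over a uniform grid on [-limit, limit]^2."""
    pts = [-limit + 2.0 * limit * i / (steps - 1) for i in range(steps)]
    return max(krasovskii_margin((x1, x2)) for x1 in pts for x2 in pts)


if __name__ == "__main__":
    # True when the sampled condition holds at every grid point.
    print(worst_margin_on_grid() < 0.0)
```

A grid check like this only samples the condition and is not a proof. For this particular example the determinant of J^T + J equals 8 + 36 x2^2 and its trace is negative, so the matrix is negative definite everywhere, but in general a formal guarantee needs an analytic or certified bound.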

