While reinforcement learning produces very promising results for many applications, its main disadvantage is the lack of safety guarantees, which prevents its use in safety-critical systems. In this work, we address this issue with a safety shield for nonlinear continuous systems that solve reach-avoid tasks. Our safety shield prevents potentially unsafe actions from a reinforcement learning agent from being applied by projecting the proposed action to the closest safe action. This approach, called action projection, is implemented via mixed-integer optimization. The safety constraints for action projection are obtained by applying parameterized reachability analysis using polynomial zonotopes, which enables us to accurately capture the nonlinear effects of the actions on the system. In contrast to other state-of-the-art approaches for action projection, our safety shield can efficiently handle input constraints and dynamic obstacles, eases the incorporation of the spatial robot dimensions into the safety constraints, guarantees robust safety despite process noise and measurement errors, and is well suited for high-dimensional systems, as we demonstrate on several challenging benchmark systems.
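The action-projection step can be illustrated with a minimal sketch. Assuming the safety constraints obtained from the reachability analysis have already been expressed as linear constraints A u <= b on the action (the approach described above uses a mixed-integer encoding of polynomial-zonotope reachable sets, which is omitted here for brevity), projection reduces to finding the feasible action closest to the agent's proposal. All function names, variables, and numbers below are illustrative assumptions, not taken from the paper.

```python
# Simplified sketch of action projection: find the safe action closest to the
# RL agent's proposal. The full method encodes reachability-based safety
# constraints as a mixed-integer program; here we only show a plain quadratic
# program over placeholder linear constraints A u <= b and input bounds.
import cvxpy as cp
import numpy as np

def project_action(u_rl, A, b, u_min, u_max):
    """Project the proposed action u_rl onto the set
    {u : A u <= b, u_min <= u <= u_max} and return the closest safe action."""
    u = cp.Variable(u_rl.shape[0])
    objective = cp.Minimize(cp.sum_squares(u - u_rl))    # distance to proposal
    constraints = [A @ u <= b,                            # safety constraints
                   u >= u_min, u <= u_max]                # input constraints
    cp.Problem(objective, constraints).solve()
    return u.value

# Illustrative usage with a 2-D action space and arbitrary constraint data
u_rl = np.array([1.5, -0.8])                 # action proposed by the RL agent
A = np.array([[1.0, 0.0], [0.0, -1.0]])      # placeholder safety constraints
b = np.array([1.0, 0.5])
u_safe = project_action(u_rl, A, b, u_min=-1.0, u_max=1.0)
```

If the proposed action already satisfies the constraints, the projection returns it unchanged, so the shield only intervenes when the agent's action would be unsafe.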