Probabilistic Safeguard for Reinforcement Learning Using Safety Index Guided Gaussian Process Models - Zhuanzhi Papers


November 29, 2022


Weiye Zhao, Tairan He, Changliu Liu

From arXiv. First paper to use Gaussian processes to provide a safety guarantee in energy-based safe control.

Safety is one of the biggest concerns in applying reinforcement learning (RL) to the physical world. At its core, the challenge is to ensure that RL agents persistently satisfy a hard state constraint without white-box or black-box dynamics models. This paper presents an integrated model learning and safe control framework to safeguard any agent, with its dynamics learned as Gaussian processes. The proposed theory provides (i) a novel method for constructing an offline dataset for model learning that best achieves the safety requirements; (ii) a parameterization rule for the safety index that ensures the existence of safe control; and (iii) a safety guarantee in terms of probabilistic forward invariance when the model is learned using the aforementioned dataset. Simulation results show that our framework guarantees almost zero safety violations on various continuous control tasks.
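The abstract gives no implementation details, so the following is only a toy sketch of the general idea it describes: fit a Gaussian process to unknown dynamics from an offline dataset, then use the posterior uncertainty to filter a reference control so that a safety index stays non-positive with high probability. Everything specific here is an assumption made for illustration and is not taken from the paper: the 1-D system `x_next = x + dt * (f(x) + u)`, the drift `f(x) = sin(x)`, the RBF kernel and its hyperparameters, the safety index `phi(x) = x - x_max`, the decrease condition `phi(x_next) <= max(phi(x) - eta, 0)`, the confidence multiplier `beta`, and the hypothetical helper names `rbf_kernel`, `gp_posterior`, and `safe_control`.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.5, signal_var=1.0):
    """Squared-exponential kernel between point sets a (n, d) and b (m, d)."""
    sq_dist = (np.sum(a**2, axis=1)[:, None]
               + np.sum(b**2, axis=1)[None, :]
               - 2.0 * a @ b.T)
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)

def gp_posterior(x_train, y_train, x_query, noise_var=1e-4):
    """Standard GP regression: posterior mean and std of f at x_query."""
    K = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_query)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(rbf_kernel(x_query, x_query)) - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

def safe_control(x, u_ref, mean, std, x_max=1.0, dt=0.1, eta=0.05, beta=2.0):
    """Clip u_ref so the pessimistic next state meets the assumed condition
    phi(x_next) <= max(phi(x) - eta, 0), with phi(x) = x - x_max."""
    phi = x - x_max
    target = max(phi - eta, 0.0)
    # Pessimistic drift: posterior mean plus beta standard deviations.
    drift_ucb = mean + beta * std
    # Largest u for which x + dt * (drift_ucb + u) - x_max <= target.
    u_max = (target + x_max - x) / dt - drift_ucb
    return min(u_ref, u_max)

# Offline dataset of noisy drift observations (toy data, assumed for illustration).
rng = np.random.default_rng(0)
x_train = np.linspace(-2.0, 2.0, 20)[:, None]
y_train = np.sin(x_train[:, 0]) + 0.01 * rng.standard_normal(20)

# Filter an aggressive reference control near the constraint boundary.
x = 0.9
mean, std = gp_posterior(x_train, y_train, np.array([[x]]))
u_safe = safe_control(x, u_ref=5.0, mean=mean[0], std=std[0])
x_next = x + 0.1 * (np.sin(x) + u_safe)  # step of the true (toy) system
print(f"u_safe={u_safe:.3f}, x_next={x_next:.3f}")
```

In this sketch the filter has a closed form because the toy system is scalar and affine in the control; a higher-dimensional system would typically need a constrained optimization at this step. The reference control passes through unchanged whenever it already satisfies the pessimistic condition.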

