Provably Safe Reinforcement Learning: A Theoretical and Experimental Comparison


May 9, 2023


Hanna Krasowski, Jakob Thumm, Marlon Müller, Lukas Schäfer, Xiao Wang, Matthias Althoff

Ensuring safety of reinforcement learning (RL) algorithms is crucial to unlock their potential for many real-world tasks. However, vanilla RL does not guarantee safety. In recent years, several methods have been proposed to provide safety guarantees for RL by design. Yet, there is no comprehensive comparison of these provably safe RL methods. We therefore introduce a categorization of existing provably safe RL methods, present the theoretical foundations for both continuous and discrete action spaces, and benchmark the methods' performance empirically. The methods are categorized based on how the action is adapted by the safety method: action replacement, action projection, and action masking. Our experiments on an inverted pendulum and quadrotor stabilization task show that all provably safe methods are indeed always safe. Furthermore, their trained performance is comparable to unsafe baselines. The benchmarking suggests that different provably safe RL approaches should be selected depending on safety specifications, RL algorithms, and type of action space.
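To make the three categories named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a toy setting in which the safe continuous actions at the current state form a known interval `[safe_low, safe_high]`, and the safe discrete actions are given by a boolean mask. The function names, the interval representation, and the `fallback` action are all hypothetical choices made for illustration; computing the safe set itself is the hard part in practice and is not shown here.

```python
import numpy as np


def action_replacement(action, safe_low, safe_high, fallback):
    """Keep the proposed action if it is safe; otherwise substitute a
    known-safe fallback action (assumed here to be supplied by the caller)."""
    if safe_low <= action <= safe_high:
        return action
    return fallback


def action_projection(action, safe_low, safe_high):
    """Return the safe action closest to the proposed one.
    For an interval safe set this reduces to clipping."""
    return float(np.clip(action, safe_low, safe_high))


def action_masking(logits, safe_mask):
    """Select the highest-scoring discrete action among those marked safe."""
    safe_mask = np.asarray(safe_mask, dtype=bool)
    if not safe_mask.any():
        raise ValueError("no safe action available")
    # Unsafe actions get a score of -inf so they can never be selected.
    masked = np.where(safe_mask, np.asarray(logits, dtype=float), -np.inf)
    return int(np.argmax(masked))


# Toy usage: the agent proposes 0.9, but only [-0.5, 0.5] is safe.
replaced = action_replacement(0.9, -0.5, 0.5, fallback=0.0)
projected = action_projection(0.9, -0.5, 0.5)
# Discrete case: action 2 has the highest score but is marked unsafe.
chosen = action_masking([2.0, 1.0, 3.0], [True, True, False])
```

In this sketch, replacement discards the unsafe proposal entirely, projection stays as close to it as the safe set allows, and masking prevents the agent from ever choosing an unsafe discrete action in the first place.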
