During training, reinforcement learning systems interact with the world without considering the safety of their actions. When deployed in the real world, such systems can be dangerous and cause harm to their surroundings. Often, dangerous situations can be mitigated by defining a set of rules that the system must not violate under any conditions. For example, in robot navigation, one safety rule would be to avoid colliding with surrounding objects and people. In this work, we define safety rules in terms of the relationships between the agent and objects, and use them to prevent reinforcement learning systems from performing potentially harmful actions. We propose a new safe epsilon-greedy algorithm that uses these safety rules to override an agent's action whenever it is judged unsafe. In our experiments, we show that the safe epsilon-greedy policy significantly increases the safety of the agent during training, improves learning efficiency with much faster convergence, and achieves better performance than the base model.
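To illustrate the idea of rule-based overriding, the following is a minimal sketch of safe epsilon-greedy action selection, not the authors' implementation; the helpers `is_safe` and `safe_fallback`, and the choice to fall back to the best-valued safe action, are assumptions introduced here for illustration.

```python
import random

def safe_epsilon_greedy_action(q_values, state, epsilon, is_safe, safe_fallback):
    """Pick an action epsilon-greedily, then override it if it violates a safety rule.

    q_values: mapping from action -> estimated value in the current state.
    is_safe(state, action): hypothetical check returning True if the action
        respects all safety rules (e.g. does not lead to a collision).
    safe_fallback(state): hypothetical known-safe action (e.g. stop / no-op).
    """
    actions = list(q_values.keys())

    # Standard epsilon-greedy proposal: explore with probability epsilon,
    # otherwise exploit the current value estimates.
    if random.random() < epsilon:
        proposed = random.choice(actions)
    else:
        proposed = max(actions, key=lambda a: q_values[a])

    # Safety override: keep the proposed action only if it satisfies the rules.
    if is_safe(state, proposed):
        return proposed

    # Otherwise replace it with the best-valued action that is still safe,
    # or with a safe fallback if no such action exists.
    safe_actions = [a for a in actions if is_safe(state, a)]
    if safe_actions:
        return max(safe_actions, key=lambda a: q_values[a])
    return safe_fallback(state)
```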