Evaluation of deep reinforcement learning (RL) is inherently challenging. In particular, the opaqueness of learned policies and the stochastic nature of both agents and environments make it difficult to test the behavior of deep RL agents. We present a search-based testing framework that enables a wide range of novel analysis capabilities for evaluating the safety and performance of deep RL agents. For safety testing, our framework utilizes a search algorithm that searches for a reference trace that solves the RL task. The backtracking states of the search, called boundary states, represent safety-critical situations. We create safety test suites that evaluate how well the RL agent escapes safety-critical situations near these boundary states. For robust performance testing, we create a diverse set of traces via fuzz testing. These fuzz traces are used to bring the agent into a wide variety of potentially unknown states, from which the average performance of the agent is compared to the average performance of the fuzz traces. We apply our search-based testing approach to RL agents trained on Nintendo's Super Mario Bros.
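To make the boundary-state idea concrete, the following is a minimal, hypothetical sketch (not the paper's implementation): a depth-first search over a toy grid world stands in for the search for a reference trace, with `#` cells marking unsafe states. Whenever the search must backtrack out of a state because every successor is unsafe or already explored, that state is recorded as a boundary state, i.e. a candidate safety-critical situation for a test suite. The grid layout and all names here are illustrative assumptions.

```python
# Toy grid: 'S' start, 'G' goal, '#' unsafe (e.g. the agent falls into a pit).
GRID = [
    "S.#.",
    ".#..",
    ".###",
    "...G",
]
MOVES = [(0, 1), (1, 0), (0, -1), (-1, 0)]  # right, down, left, up

def search_with_boundary_states(grid):
    """Return (reference_trace, boundary_states) for the toy grid world."""
    rows, cols = len(grid), len(grid[0])
    start = next((r, c) for r in range(rows) for c in range(cols) if grid[r][c] == "S")
    goal = next((r, c) for r in range(rows) for c in range(cols) if grid[r][c] == "G")
    visited = {start}
    trace = [start]       # the reference trace under construction
    boundary = []         # states the search backtracks out of

    def dfs(state):
        if state == goal:
            return True
        for dr, dc in MOVES:
            nxt = (state[0] + dr, state[1] + dc)
            r, c = nxt
            if not (0 <= r < rows and 0 <= c < cols):
                continue                      # off the map
            if grid[r][c] == "#" or nxt in visited:
                continue                      # unsafe cell or already explored
            visited.add(nxt)
            trace.append(nxt)
            if dfs(nxt):
                return True
            trace.pop()                       # undo: this successor led nowhere
        boundary.append(state)                # search backtracks out of `state`
        return False

    dfs(start)
    return trace, boundary

trace, boundary = search_with_boundary_states(GRID)
print("reference trace:", trace)
print("boundary states:", boundary)
```

A safety test suite in the sense of the abstract would then reset the environment to states near each entry of `boundary` and check whether the trained agent escapes the critical situation; the real framework operates on an actual RL environment rather than a hand-written grid.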