We consider an extended notion of reinforcement learning in which the environment can simulate the agent and base its outputs on the agent's hypothetical behavior. Since good performance usually requires paying attention to whatever the environment's outputs are based on, we argue that for an agent to achieve on-average good performance across many such extended environments, it must self-reflect. Thus, an agent's self-reflection ability can be numerically estimated by running the agent through a battery of extended environments. We are simultaneously releasing an open-source library of extended environments to serve as proof of concept for this technique. As the library is the first of its kind, we have avoided the difficult problem of optimizing it; instead, we have chosen environments with interesting properties: some seem paradoxical, some lead to interesting thought experiments, and some even suggest how self-reflection might have evolved in nature. We give examples and introduce a simple transformation that experimentally seems to increase self-reflection.
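To make the notion concrete, the following is a minimal Python sketch of what an extended environment might look like; the class MirrorEnv, the Agent type, and the PROBE_HISTORY parameter are illustrative assumptions and are not taken from the released library's actual API. The key difference from a standard RL environment is that step receives the agent itself and may query the agent's hypothetical behavior when computing reward.

```python
from typing import Callable, List, Tuple

# An agent is modeled as a function from an observation history to an action.
Agent = Callable[[List[int]], int]

class MirrorEnv:
    """Toy extended environment (hypothetical; not from the released library).
    At every step it simulates the agent on a fixed probe history and pays
    reward only when the agent's real action matches the action it would
    hypothetically take on that probe history."""

    PROBE_HISTORY: List[int] = [0, 1, 0, 1]

    def __init__(self) -> None:
        self.history: List[int] = []

    def step(self, agent: Agent, action: int) -> Tuple[int, float]:
        # The environment calls the agent itself -- something an ordinary
        # (non-extended) environment cannot do.
        hypothetical = agent(self.PROBE_HISTORY)
        reward = 1.0 if action == hypothetical else 0.0
        obs = len(self.history) % 2  # arbitrary observation signal
        self.history.append(obs)
        return obs, reward

if __name__ == "__main__":
    # An agent whose policy ignores the history trivially agrees with its own
    # hypothetical behavior and earns full reward; a history-dependent agent
    # that cannot model itself generally does not.
    constant_agent: Agent = lambda history: 0
    env = MirrorEnv()
    total = 0.0
    for _ in range(10):
        action = constant_agent(env.history)
        _, r = env.step(constant_agent, action)
        total += r
    print("total reward:", total)  # 10.0 for the constant agent
```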