The study of zero-shot generalisation (ZSG) in deep Reinforcement Learning (RL) aims to produce RL algorithms whose policies generalise well to novel, unseen situations at deployment time, avoiding overfitting to their training environments. Tackling this is vital if we are to deploy reinforcement learning algorithms in real-world scenarios, where the environment will be diverse, dynamic and unpredictable. This survey provides an overview of this nascent field. Building upon previous works, we rely on a unifying formalism and terminology for discussing different ZSG problems. We go on to categorise existing benchmarks for ZSG, as well as current methods for tackling these problems. Finally, we provide a critical discussion of the current state of the field, including recommendations for future work. Among other conclusions, we argue that taking a purely procedural content generation approach to benchmark design is not conducive to progress in ZSG, we suggest fast online adaptation and tackling RL-specific problems as promising areas for future work on methods for ZSG, and we recommend building benchmarks in underexplored problem settings such as offline RL ZSG and reward-function variation.