Across the Arcade Learning Environment, Rainbow achieves a level of performance competitive with humans and modern RL algorithms. However, attaining this level of performance requires large amounts of data and hardware resources, making research in this area computationally expensive and use in practical applications often infeasible. This paper's contribution is threefold: we (1) propose an improved version of Rainbow that drastically reduces Rainbow's data, training-time, and compute requirements while maintaining its competitive performance; (2) empirically demonstrate the effectiveness of our approach through experiments on the Arcade Learning Environment; and (3) conduct a number of ablation studies to investigate the effect of each proposed modification individually. Our improved version of Rainbow reaches a median human-normalized score close to classic Rainbow's, while using 20 times less data and requiring only 7.5 hours of training time on a single GPU. We also provide our full implementation, including pre-trained models.
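As a point of reference, the median human-normalized score mentioned above is conventionally computed per game as the agent's score rescaled between a random policy (0.0) and a human baseline (1.0), then aggregated by taking the median across games. The sketch below illustrates that convention; the game names and all numeric values are purely illustrative assumptions, not results from this paper.

```python
# Hedged sketch of the conventional human-normalized score (HNS).
# All numbers below are made up for illustration only.
from statistics import median

def human_normalized_score(agent: float, rand: float, human: float) -> float:
    # 0.0 corresponds to a random policy, 1.0 to the human baseline.
    return (agent - rand) / (human - rand)

# Hypothetical per-game scores: (agent, random baseline, human baseline).
per_game = {
    "Breakout": human_normalized_score(300.0, 1.7, 30.5),
    "Pong": human_normalized_score(20.0, -20.7, 14.6),
    "Seaquest": human_normalized_score(2000.0, 68.4, 42054.7),
}

# The headline metric is the median of the per-game scores.
median_hns = median(per_game.values())
```

Using the median rather than the mean makes the aggregate robust to a few games with extremely high (or low) normalized scores dominating the summary.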