A key theme in the past decade has been that when large neural networks and large datasets combine, they can produce remarkable results. In deep reinforcement learning (RL), this paradigm is commonly made possible through experience replay, whereby a dataset of past experiences is used to train a policy or value function. However, unlike in supervised or self-supervised learning, an RL agent has to collect its own data, which is often limited. Thus, it is challenging to reap the benefits of deep learning, and even small neural networks can overfit at the start of training. In this work, we leverage the tremendous recent progress in generative modeling and propose Synthetic Experience Replay (SynthER), a diffusion-based approach to arbitrarily upsample an agent's collected experience. We show that SynthER is an effective method for training RL agents across offline and online settings. In offline settings, we observe drastic improvements both when upsampling small offline datasets and when training larger networks with additional synthetic data. Furthermore, SynthER enables online agents to train with a much higher update-to-data ratio than before, leading to a large increase in sample efficiency, without any algorithmic changes. We believe that synthetic training data could open the door to realizing the full potential of deep learning for replay-based RL algorithms from limited data.
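To make the upsampling idea concrete, the sketch below illustrates the general pattern the abstract describes: flatten each transition (s, a, r, s', done) into a vector, fit a generative model to the agent's real transitions, and sample additional synthetic transitions to enlarge the replay buffer before training the policy or value function. This is only an illustrative sketch, not the paper's implementation: the GaussianStandIn class is a hypothetical placeholder standing in for the diffusion model SynthER actually trains, and the buffer layout, dimensions, and upsample factor are assumptions chosen for the example.

```python
# Sketch of diffusion-style experience upsampling: fit a generative model to
# real transitions and append synthetic samples to the replay data.
# GaussianStandIn is a hypothetical placeholder, NOT the SynthER diffusion model.

import numpy as np


class GaussianStandIn:
    """Placeholder generative model: fits a diagonal Gaussian to transition
    vectors and samples from it. In SynthER this role is played by a
    denoising diffusion model trained on the same flattened transitions."""

    def fit(self, x: np.ndarray) -> None:
        self.mean = x.mean(axis=0)
        self.std = x.std(axis=0) + 1e-6  # avoid zero variance

    def sample(self, n: int) -> np.ndarray:
        return np.random.normal(self.mean, self.std, size=(n, self.mean.shape[0]))


def upsample_transitions(real: np.ndarray, upsample_factor: int) -> np.ndarray:
    """Train the generative model on real transitions and return the real
    data concatenated with synthetic samples."""
    model = GaussianStandIn()
    model.fit(real)
    synthetic = model.sample(upsample_factor * len(real))
    return np.concatenate([real, synthetic], axis=0)


if __name__ == "__main__":
    # Toy buffer: 1,000 transitions, each flattened to [s, a, r, s', done].
    rng = np.random.default_rng(0)
    obs_dim, act_dim = 4, 2
    transition_dim = obs_dim + act_dim + 1 + obs_dim + 1  # = 12
    real_buffer = rng.normal(size=(1_000, transition_dim))

    augmented = upsample_transitions(real_buffer, upsample_factor=9)
    print(augmented.shape)  # (10000, 12): ten times more data for the RL update
```

The key design point the abstract emphasizes is that the downstream RL algorithm is unchanged: the augmented buffer simply replaces (or supplements) the real one, which is what allows higher update-to-data ratios online and larger networks offline without algorithmic modifications.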