Continuous Episodic Control - Zhuanzhi Papers

April 20, 2023

Continuous Episodic Control

Zhao Yang,Thomas M. Moerland,Mike Preuss,Aske Plaat

Non-parametric episodic memory can be used to quickly latch onto high-rewarded experience in reinforcement learning tasks. In contrast to parametric deep reinforcement learning approaches in which reward signals need to be back-propagated slowly, these methods only need to discover the solution once, and may then repeatedly solve the task. However, episodic control solutions are stored in discrete tables, and this approach has so far only been applied to discrete action space problems. Therefore, this paper introduces Continuous Episodic Control (CEC), a novel non-parametric episodic memory algorithm for sequential decision making in problems with a continuous action space. Results on several sparse-reward continuous control environments show that our proposed method learns faster than state-of-the-art model-free RL and memory-augmented RL algorithms, while maintaining good long-run performance as well. In short, CEC can be a fast approach for learning in continuous control tasks.
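
The abstract does not spell out how CEC stores or retrieves experience, so the following is only a minimal sketch of the general idea of a non-parametric episodic memory over continuous states and actions, not the authors' algorithm. It assumes a plain k-nearest-neighbour lookup with Euclidean distance, discounted return-to-go as the stored value, and optional Gaussian noise on the retrieved action; the class name `EpisodicMemory` and all of its parameters are hypothetical.

```python
import math
import random


class EpisodicMemory:
    """Toy non-parametric memory of (state, action, return) triples.

    Illustrative assumption only; this is not the CEC algorithm from the paper.
    """

    def __init__(self, k=3, capacity=1000):
        self.k = k                # number of nearest neighbours consulted
        self.capacity = capacity  # maximum number of stored entries
        self.entries = []         # list of (state, action, episodic_return)

    def add_episode(self, transitions, gamma=0.99):
        """Store every step of one episode with its discounted return-to-go.

        transitions: list of (state, action, reward) tuples in time order.
        """
        g = 0.0
        for state, action, reward in reversed(transitions):
            g = reward + gamma * g
            self.entries.append((tuple(state), tuple(action), g))
        # Keep only the highest-return entries when over capacity.
        self.entries.sort(key=lambda e: e[2], reverse=True)
        del self.entries[self.capacity:]

    def act(self, state, noise_std=0.0, rng=random):
        """Return the action stored with the best of the k nearest states."""
        if not self.entries:
            return None  # caller should fall back to random exploration
        nearest = sorted(self.entries, key=lambda e: math.dist(e[0], state))
        best = max(nearest[: self.k], key=lambda e: e[2])
        # Optional Gaussian noise keeps some exploration around the action.
        return tuple(a + rng.gauss(0.0, noise_std) for a in best[1])


memory = EpisodicMemory(k=2)
# One rewarded two-step episode and one unrewarded single-step episode.
memory.add_episode([((0.0, 0.0), (0.5,), 0.0), ((1.0, 0.0), (-0.2,), 1.0)], gamma=0.9)
memory.add_episode([((0.1, 0.0), (0.9,), 0.0)], gamma=0.9)
print(memory.act((0.05, 0.0)))  # (0.5,): the nearby action that led to reward
```

Because the rewarded episode is written into the table as soon as it ends, a single successful trajectory is enough for the lookup to start replaying it. This is the "discover the solution once" property the abstract contrasts with slow gradient-based propagation of reward.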
