Supplementing Gradient-Based Reinforcement Learning with Simple Evolutionary Ideas

10 May 2023

Harshad Khadilkar

From arXiv, 17 pages

We present a simple, sample-efficient algorithm for introducing large but directed learning steps in reinforcement learning (RL), through the use of evolutionary operators. The methodology uses a population of RL agents training with a common experience buffer, with occasional crossovers and mutations of the agents in order to search efficiently through the policy space. Unlike prior literature on combining evolutionary search (ES) with RL, this work does not generate a distribution of agents from a common mean and covariance matrix. Neither does it require the evaluation of the entire population of policies at every time step. Instead, we focus on gradient-based training throughout the life of every policy (individual), with a sparse amount of evolutionary exploration. The resulting algorithm is shown to be robust to hyperparameter variations. As a surprising corollary, we show that simply initialising and training multiple RL agents with a common memory (with no further evolutionary updates) outperforms several standard RL baselines.
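The loop described in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: the toy task (recovering a hidden linear target), the function names `crossover`, `mutate`, `sq_error` and `train_population`, and every hyperparameter value are assumptions chosen for illustration. The gradient step is a plain least-squares update standing in for an RL update, and the fitness uses the hidden target directly where a real setup would compare episode returns. What the sketch does keep from the abstract is the structure: several agents train by gradient descent on one common buffer, and crossover and mutation are applied only occasionally.

```python
import random


def crossover(a, b, rng):
    """Uniform crossover: each parameter is copied from one of the two parents."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]


def mutate(theta, sigma, rng):
    """Gaussian mutation: add zero-mean noise to every parameter."""
    return [w + rng.gauss(0.0, sigma) for w in theta]


def sq_error(theta, target):
    """Toy fitness (lower is better); a real setup would compare episode returns."""
    return sum((w - t) ** 2 for w, t in zip(theta, target))


def train_population(n_agents=4, dim=3, steps=300, evolve_every=50,
                     batch_size=16, lr=0.05, sigma=0.01, seed=0):
    """Train n_agents (at least 2) on a shared buffer with sparse evolution."""
    rng = random.Random(seed)
    w_true = [rng.gauss(0.0, 1.0) for _ in range(dim)]   # hidden toy target
    thetas = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_agents)]
    buffer = []                                          # common experience buffer

    for t in range(1, steps + 1):
        # Each agent adds one sample to the shared buffer. In this toy task the
        # sample does not depend on the agent; in RL it would be a transition
        # generated by that agent's own policy.
        for _ in thetas:
            x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            buffer.append((x, sum(w * xi for w, xi in zip(w_true, x))))

        # Each agent takes one gradient step on a minibatch from the shared buffer.
        for theta in thetas:
            batch = [rng.choice(buffer) for _ in range(batch_size)]
            grad = [0.0] * dim
            for x, y in batch:
                err = sum(w * xi for w, xi in zip(theta, x)) - y
                for k in range(dim):
                    grad[k] += 2.0 * err * x[k] / batch_size
            for k in range(dim):
                theta[k] -= lr * grad[k]

        # Sparse evolutionary step: overwrite the weakest agent with a mutated
        # child of the two fittest. Setting evolve_every=None disables it.
        if evolve_every and t % evolve_every == 0:
            ranked = sorted(thetas, key=lambda th: sq_error(th, w_true))
            child = mutate(crossover(ranked[0], ranked[1], rng), sigma, rng)
            ranked[-1][:] = child

    return thetas, buffer, w_true
```

Calling `train_population()` runs the population with an evolutionary step every 50 iterations, while `train_population(evolve_every=None)` corresponds to the variant mentioned at the end of the abstract, in which multiple agents share a common memory with no further evolutionary updates. The population is only ranked at the sparse evolutionary steps, in keeping with the abstract's point that the method does not evaluate every policy at every time step.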
