Neighboring state-based RL Exploration - 专知 Papers


DQN · Discretization · Better · Episode · Agent

December 21, 2022

Neighboring state-based RL Exploration


Jeffery Cheng, Kevin Li, Justin Lin, Pedro Pachuca

Reinforcement Learning is a powerful tool to model decision-making processes. However, it relies on an exploration-exploitation trade-off that remains an open challenge for many tasks. In this work, we study neighboring state-based, model-free exploration led by the intuition that, for an early-stage agent, considering actions derived from a bounded region of nearby states may lead to better actions when exploring. We propose two algorithms that choose exploratory actions based on a survey of nearby states, and find that one of our methods, $\rho$-explore, consistently outperforms the Double DQN baseline in a discrete environment by 49% in terms of Eval Reward Return.
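The abstract does not spell out how either algorithm surveys nearby states, so the following is only a minimal sketch of one possible reading, not the authors' $\rho$-explore. It assumes a continuous state vector, a box of half-width `rho` around the current state as the "bounded region", and a Q-function that returns one value per discrete action. The names `neighbor_explore_action`, `select_action`, `toy_q`, and the parameters `rho`, `n_neighbors`, and `epsilon` are all hypothetical. On an exploration step, the agent samples neighbors, records the greedy action at each, and picks one of those surveyed actions instead of a uniformly random action.

```python
import random
from typing import Callable, List, Sequence

# Hypothetical interface: a Q-function maps a state to one value per action.
QFunction = Callable[[Sequence[float]], Sequence[float]]


def greedy_action(q_fn: QFunction, state: Sequence[float]) -> int:
    """Return the index of the highest-valued action at `state`."""
    q_values = q_fn(state)
    return max(range(len(q_values)), key=lambda a: q_values[a])


def neighbor_explore_action(q_fn: QFunction, state: Sequence[float],
                            rho: float, n_neighbors: int,
                            rng: random.Random) -> int:
    """Survey states near `state` and return one of their greedy actions.

    Assumption: neighbors are drawn uniformly from the box
    [state - rho, state + rho]; the paper may define the region differently.
    Requires n_neighbors >= 1.
    """
    surveyed: List[int] = []
    for _ in range(n_neighbors):
        neighbor = [x + rng.uniform(-rho, rho) for x in state]
        surveyed.append(greedy_action(q_fn, neighbor))
    # Uniform choice over surveyed actions: an action that is greedy at
    # more neighbors is proportionally more likely to be selected.
    return rng.choice(surveyed)


def select_action(q_fn: QFunction, state: Sequence[float], epsilon: float,
                  rho: float, n_neighbors: int, rng: random.Random) -> int:
    """Epsilon-greedy selection whose exploratory branch uses the survey."""
    if rng.random() < epsilon:
        return neighbor_explore_action(q_fn, state, rho, n_neighbors, rng)
    return greedy_action(q_fn, state)


def toy_q(state: Sequence[float]) -> List[float]:
    """Toy stand-in for a trained Q-network with three actions."""
    x, y = state
    return [x, y, -(x + y)]


rng = random.Random(0)
action = select_action(toy_q, [2.0, 0.5], epsilon=0.3, rho=1.0,
                       n_neighbors=8, rng=rng)
print("chosen action:", action)
```

With `rho = 0` every neighbor coincides with the current state, so the exploratory branch reduces to the greedy action. Larger values of `rho` widen the survey and bring in actions preferred elsewhere in the state space.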


