Reinforcement Learning is a powerful tool for modeling decision-making processes. However, it relies on an exploration-exploitation trade-off that remains an open challenge for many tasks. In this work, we study neighboring-state-based, model-free exploration, guided by the intuition that, for an early-stage agent, considering actions derived from a bounded region of nearby states may yield better exploratory actions. We propose two algorithms that choose exploratory actions based on a survey of nearby states, and find that one of our methods, $\rho$-explore, consistently outperforms the Double DQN baseline in a discrete environment by 49\% in terms of Eval Reward Return.
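To make the idea of a "survey of nearby states" concrete, the sketch below shows one plausible interpretation, not the paper's exact algorithm: sample a few states inside a radius-$\rho$ ball around the current state, evaluate a Q-function at each, and take the action that scores highest anywhere in that neighborhood. The function names, the uniform perturbation, and the argmax aggregation rule are all assumptions made for illustration.

```python
# Minimal sketch of neighboring-state-based exploration (assumed mechanism, see lead-in).
import numpy as np

def rho_explore_action(state, q_values, rho=0.1, n_samples=8, rng=None):
    """Pick an exploratory action by surveying states within a radius-rho ball.

    state:     1-D np.ndarray observation.
    q_values:  callable mapping a state to a vector of Q-values (one per action).
    rho:       radius of the bounded region of nearby states (uniform noise assumed).
    n_samples: number of neighboring states to survey.
    """
    rng = rng or np.random.default_rng()
    # Survey the current state plus n_samples perturbed copies of it.
    neighbors = [state] + [
        state + rng.uniform(-rho, rho, size=state.shape) for _ in range(n_samples)
    ]
    # Evaluate Q-values at every surveyed state and return the action that
    # attains the highest value anywhere in the neighborhood.
    q_matrix = np.stack([q_values(s) for s in neighbors])  # shape: (n_samples + 1, n_actions)
    return int(np.unravel_index(np.argmax(q_matrix), q_matrix.shape)[1])
```

In practice, `q_values` would wrap the agent's online Q-network, and this exploratory action would replace the uniform-random action of an epsilon-greedy policy during the exploration branch.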