Normality-Guided Distributional Reinforcement Learning for Continuous Control - Zhuanzhi Papers


January 26, 2023

Normality-Guided Distributional Reinforcement Learning for Continuous Control

Ju-Seung Byun, Andrew Perrault

Learning a predictive model of the mean return, or value function, plays a critical role in many reinforcement learning algorithms. Distributional reinforcement learning (DRL) methods instead model the value distribution, which has been shown to improve performance in many settings. In this paper, we model the value distribution as approximately normal using the Markov Chain central limit theorem. We analytically compute quantile bars to provide a new DRL target that is informed by the decrease in standard deviation that occurs over the course of an episode. In addition, we propose a policy update strategy based on uncertainty as measured by structural characteristics of the value distribution not present in the standard value function. The approach we outline is compatible with many DRL structures. We use two representative on-policy algorithms, PPO and TRPO, as testbeds and show that our methods produce performance improvements in continuous control tasks.
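To make the idea of analytically computed quantiles more concrete, the following is a minimal sketch, not the authors' implementation. It assumes the value distribution at a state is summarized by a mean and a standard deviation, and reads quantile targets directly off the normal inverse CDF at midpoint levels. The function names, the midpoint levels, and the square-root shrinkage schedule in `remaining_std` are all illustrative assumptions; the abstract only states that the standard deviation decreases over the course of an episode.

```python
from math import sqrt
from statistics import NormalDist


def normal_quantile_targets(mean, std, num_quantiles):
    """Quantile values of N(mean, std^2) at midpoint levels (2i + 1) / (2N)."""
    if std <= 0:
        # Degenerate case: all probability mass sits at the mean.
        return [mean] * num_quantiles
    dist = NormalDist(mu=mean, sigma=std)
    levels = [(2 * i + 1) / (2 * num_quantiles) for i in range(num_quantiles)]
    return [dist.inv_cdf(tau) for tau in levels]


def remaining_std(initial_std, step, horizon):
    """Hypothetical schedule (an assumption, not taken from the paper):
    the std shrinks with the square root of the fraction of steps left."""
    steps_left = max(horizon - step, 0)
    return initial_std * sqrt(steps_left / horizon)


if __name__ == "__main__":
    # Quantile targets early and late in a 100-step episode.
    for step in (0, 75):
        std = remaining_std(2.0, step, 100)
        print(step, normal_quantile_targets(10.0, std, 5))
```

Under these assumptions the targets spread symmetrically around the mean and tighten as the episode progresses, which is the qualitative behavior the abstract describes.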


