We propose \emph{Choquet regularizers} to measure and manage the level of exploration in reinforcement learning (RL), and reformulate the continuous-time entropy-regularized RL problem of Wang et al. (2020, JMLR, 21(198)) by replacing the differential entropy used for regularization with a Choquet regularizer. We derive the Hamilton--Jacobi--Bellman equation of the problem and solve it explicitly in the linear--quadratic (LQ) case by statically maximizing a Choquet regularizer subject to a mean--variance constraint. Under the LQ setting, we derive explicit optimal distributions for several specific Choquet regularizers and, conversely, identify the Choquet regularizers that generate a number of broadly used exploratory samplers such as $\epsilon$-greedy, exponential, uniform, and Gaussian.
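As a minimal sketch of the replacement described above, assuming the standard exploratory objective of Wang et al. (2020) with an infinite horizon and discount rate $\rho$ (the symbols $J$, $r$, $\lambda$, and the exploratory state process $X^{\pi}$ are illustrative, not taken verbatim from the source), the regularized value to be maximized over randomized policies $\pi$ would take the form
\begin{equation*}
J(x;\pi) \;=\; \mathbb{E}\!\left[\int_0^{\infty} e^{-\rho t}\Big( r\big(X^{\pi}_t,\pi_t\big) \;+\; \lambda\,\Phi(\pi_t) \Big)\,dt \;\Big|\; X^{\pi}_0 = x \right],
\end{equation*}
where $\Phi$ is a Choquet regularizer and the term $\lambda\,\Phi(\pi_t)$ stands in place of the differential entropy $-\lambda\int \pi_t(u)\ln \pi_t(u)\,du$ used in the original entropy-regularized formulation.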