On the Global Convergence of Risk-Averse Policy Gradient Methods with Expected Conditional Risk Measures

May 30, 2023


Xian Yu, Lei Ying

Risk-sensitive reinforcement learning (RL) has become a popular tool to control the risk of uncertain outcomes and ensure reliable performance in various sequential decision-making problems. While policy gradient methods have been developed for risk-sensitive RL, it remains unclear if these methods enjoy the same global convergence guarantees as in the risk-neutral case. In this paper, we consider a class of dynamic time-consistent risk measures, called Expected Conditional Risk Measures (ECRMs), and derive policy gradient updates for ECRM-based objective functions. Under both constrained direct parameterization and unconstrained softmax parameterization, we provide global convergence and iteration complexities of the corresponding risk-averse policy gradient algorithms. We further test risk-averse variants of REINFORCE and actor-critic algorithms to demonstrate the efficacy of our method and the importance of risk control.
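The abstract does not define the risk measure itself. As a rough illustration only, the sketch below assumes a per-stage building block that mixes the expectation with the conditional value-at-risk (CVaR) of sampled costs, a form often used with conditional risk measures; whether the paper uses exactly this form is an assumption here. The names `cvar`, `mean_cvar`, `alpha`, and `lam` are hypothetical and chosen for this example.

```python
import numpy as np


def cvar(costs, alpha):
    """Empirical CVaR at level alpha of sampled costs (higher cost is worse).

    Uses the Rockafellar-Uryasev form
        eta + E[max(cost - eta, 0)] / (1 - alpha),
    evaluated at eta = the empirical alpha-quantile (the value-at-risk).
    Requires 0 <= alpha < 1.
    """
    costs = np.asarray(costs, dtype=float)
    eta = np.quantile(costs, alpha)  # value-at-risk estimate
    excess = np.maximum(costs - eta, 0.0)  # cost above the quantile
    return eta + excess.mean() / (1.0 - alpha)


def mean_cvar(costs, alpha, lam):
    """Convex combination (1 - lam) * mean + lam * CVaR_alpha.

    lam in [0, 1] is a hypothetical risk-aversion weight:
    lam = 0 is risk-neutral, lam = 1 is pure CVaR.
    """
    costs = np.asarray(costs, dtype=float)
    return (1.0 - lam) * costs.mean() + lam * cvar(costs, alpha)


if __name__ == "__main__":
    sample = [1.0, 2.0, 3.0, 4.0]
    print(cvar(sample, 0.5))            # average of the worst half
    print(mean_cvar(sample, 0.5, 0.5))  # halfway between mean and CVaR
```

With `lam = 0` the objective reduces to the usual expected cost, which is the risk-neutral case the abstract contrasts against; raising `lam` or `alpha` puts more weight on the tail of the cost distribution.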
