On Dynamic Programming Decompositions of Static Risk Measures in Markov Decision Processes - 专知论文

会员服务 ·

0

dynamic programming · Markov · 离散化 · Processing（编程语言） · 优化器 ·

2023 年 6 月 1 日

On Dynamic Programming Decompositions of Static Risk Measures in Markov Decision Processes

翻译：暂无翻译

Jia Lin Hau,Erick Delage,Mohammad Ghavamzadeh,Marek Petrik

Optimizing static risk-averse objectives in Markov decision processes is difficult because they do not admit standard dynamic programming equations common in Reinforcement Learning (RL) algorithms. Dynamic programming decompositions that augment the state space with discrete risk levels have recently gained popularity in the RL community. Prior work has shown that these decompositions are optimal when the risk level is discretized sufficiently. However, we show that these popular decompositions for Conditional-Value-at-Risk (CVaR) and Entropic-Value-at-Risk (EVaR) are inherently suboptimal regardless of the discretization level. In particular, we show that a saddle point property assumed to hold in prior literature may be violated. However, a decomposition does hold for Value-at-Risk and our proof demonstrates how this risk measure differs from CVaR and EVaR. Our findings are significant because risk-averse algorithms are used in high-stake environments, making their correctness much more critical.

翻译：暂无翻译

0

相关内容

dynamic programming

dynamic programming

【2022新书】高效深度学习，Efficient Deep Learning Book

【2022新书】高效深度学习，Efficient Deep Learning Book

专知会员服务

126+阅读 · 2022年4月21日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Reinforcement Learning: An Introduction 2018第二版 500页

Reinforcement Learning: An Introduction 2018第二版 500页

CreateAMind

14+阅读 · 2018年4月27日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

Decorin对急性缺血性卒中后血脑屏障中ZO-1蛋白的作用及机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

Insulicolide A的全合成和结构优化

国家自然科学基金

0+阅读 · 2014年12月31日

向量优化问题的近似解的最优性条件

国家自然科学基金

0+阅读 · 2012年12月31日

约化群酉表示的branching law及其应用

国家自然科学基金

0+阅读 · 2009年12月31日

食管癌转移高风险性相关的SNP位点筛查研究

国家自然科学基金

0+阅读 · 2008年12月31日

Tensor Decompositions Meet Control Theory: Learning General Mixtures of Linear Dynamical Systems

Arxiv

0+阅读 · 2023年7月23日

An Analysis of Multi-Agent Reinforcement Learning for Decentralized Inventory Control Systems

Arxiv

0+阅读 · 2023年7月21日

From computational ethics to morality: how decision-making algorithms can help us understand the emergence of moral principles, the existence of an optimal behaviour and our ability to discover it

Arxiv

0+阅读 · 2023年7月20日

Markov Decision Processes with Time-Varying Geometric Discounting

Arxiv

0+阅读 · 2023年7月19日

Game-Theoretic and Machine Learning-based Approaches for Defensive Deception: A Survey

Arxiv

26+阅读 · 2021年1月21日

VIP会员

文章信息

相关主题

dynamic programming

Processing（编程语言）

相关VIP内容

【2022新书】高效深度学习，Efficient Deep Learning Book

【2022新书】高效深度学习，Efficient Deep Learning Book

专知会员服务

126+阅读 · 2022年4月21日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

热门VIP内容

开通专知VIP会员享更多权益服务

【博士论文】面向真实世界音视联合语音识别的可扩展框架

《通过仿真与开源数据提升战略决策：机遇与局限》最新报告

【AAAI2026】善始则事半功倍：基于前缀优化的大语言模型推理强化学习

评估大语言模型在科学发现中的作用

相关资讯

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Reinforcement Learning: An Introduction 2018第二版 500页

Reinforcement Learning: An Introduction 2018第二版 500页

CreateAMind

14+阅读 · 2018年4月27日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

相关论文

Tensor Decompositions Meet Control Theory: Learning General Mixtures of Linear Dynamical Systems

Arxiv

0+阅读 · 2023年7月23日

An Analysis of Multi-Agent Reinforcement Learning for Decentralized Inventory Control Systems

Arxiv

0+阅读 · 2023年7月21日

From computational ethics to morality: how decision-making algorithms can help us understand the emergence of moral principles, the existence of an optimal behaviour and our ability to discover it

Arxiv

0+阅读 · 2023年7月20日

Markov Decision Processes with Time-Varying Geometric Discounting

Arxiv

0+阅读 · 2023年7月19日

Game-Theoretic and Machine Learning-based Approaches for Defensive Deception: A Survey

Arxiv

26+阅读 · 2021年1月21日

相关基金

Decorin对急性缺血性卒中后血脑屏障中ZO-1蛋白的作用及机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

Insulicolide A的全合成和结构优化

国家自然科学基金

0+阅读 · 2014年12月31日

向量优化问题的近似解的最优性条件

国家自然科学基金

0+阅读 · 2012年12月31日

约化群酉表示的branching law及其应用

国家自然科学基金

0+阅读 · 2009年12月31日

食管癌转移高风险性相关的SNP位点筛查研究

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员