Risk-Sensitive Markov Decision Processes with Long-Run CVaR Criterion

October 17, 2022

Li Xia,Peter W. Glynn

From arXiv: 33 pages, 7 figures, 4 tables. A risk-sensitive MDP methodology for optimizing long-run CVaR, which is extensible to data-driven learning scenarios.

CVaR (Conditional Value at Risk) is a risk metric widely used in finance. However, dynamically optimizing CVaR is difficult since it is not a standard Markov decision process (MDP) problem and the principle of dynamic programming fails. In this paper, we study the infinite-horizon discrete-time MDP with a long-run CVaR criterion, from the view of sensitivity-based optimization. By introducing a pseudo CVaR metric, we derive a CVaR difference formula which quantifies the difference of long-run CVaR under any two policies. The optimality of deterministic policies is derived. We obtain a so-called Bellman local optimality equation for CVaR, which is a necessary and sufficient condition for locally optimal policies and only a necessary condition for globally optimal policies. A CVaR derivative formula is also derived to provide more sensitivity information. Then we develop a policy-iteration-type algorithm to efficiently optimize CVaR, which is shown to converge to local optima in the mixed policy space. We further discuss some extensions, including mean-CVaR optimization and the maximization of CVaR. Finally, we conduct numerical experiments on portfolio management to demonstrate the main results. Our work may shed light on dynamically optimizing CVaR from a sensitivity viewpoint.
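For readers unfamiliar with the metric, the following is a minimal sketch of how CVaR can be estimated from a plain sample of costs. It is not the paper's long-run criterion or its policy iteration algorithm. It assumes that larger values are worse (costs rather than rewards), takes VaR as the lower empirical alpha-quantile, and uses the well-known Rockafellar-Uryasev representation, CVaR = min over y of y + E[(X - y)^+] / (1 - alpha). The function names `empirical_var_cvar` and `ru_objective` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def ru_objective(costs, y, alpha):
    """Rockafellar-Uryasev objective: y + E[(X - y)^+] / (1 - alpha)."""
    costs = np.asarray(costs, dtype=float)
    excess = np.maximum(costs - y, 0.0)
    return float(y + excess.mean() / (1.0 - alpha))


def empirical_var_cvar(costs, alpha):
    """Return (VaR, CVaR) of a cost sample at confidence level alpha.

    Assumption: larger values are worse. VaR is the smallest sample value v
    with empirical P(X <= v) >= alpha; CVaR is the Rockafellar-Uryasev
    objective evaluated at y = VaR, where the objective attains its minimum.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    sorted_costs = np.sort(np.asarray(costs, dtype=float))
    n = sorted_costs.size
    if n == 0:
        raise ValueError("costs must be non-empty")
    # Index of the lower empirical alpha-quantile in the sorted sample.
    k = max(int(np.ceil(alpha * n)) - 1, 0)
    var = float(sorted_costs[k])
    cvar = ru_objective(sorted_costs, var, alpha)
    return var, cvar
```

For the sample `[1.0, 2.0, 3.0, 4.0]` with `alpha = 0.5`, the function returns a VaR of 2.0 and a CVaR of 3.5, which is the mean of the worst half of the sample. The minimization over `y` hints at why a pseudo CVaR metric with an auxiliary threshold is a natural device in the dynamic setting, although the exact construction used in the paper is not reproduced here.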
