Approximate gradient ascent methods for distortion risk measures - 专知 (Zhuanzhi) papers

DRM · gradient ascent · gradient ascent method · estimation/estimator · approximation

February 22, 2022

Approximate gradient ascent methods for distortion risk measures

Nithia Vijayan, Prashanth L. A.

From arXiv. arXiv admin note: text overlap with arXiv:2107.04422

We propose approximate gradient ascent algorithms for the risk-sensitive reinforcement learning control problem, in on-policy as well as off-policy settings. We consider episodic Markov decision processes, and model the risk using a distortion risk measure (DRM) of the cumulative discounted reward. Our algorithms estimate the DRM using order statistics of the cumulative rewards, and calculate approximate gradients from the DRM estimates using a smoothed functional-based gradient estimation scheme. We derive non-asymptotic bounds that establish the convergence of our proposed algorithms to an approximate stationary point of the DRM objective.
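The two ingredients named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the standard L-statistic form of the empirical DRM, in which the i-th order statistic of m sampled returns is weighted by g(1 - (i-1)/m) - g(1 - i/m) for a distortion function g, and it assumes a generic two-sided smoothed functional estimator with a random direction drawn uniformly from the unit sphere. The function names (`drm_estimate`, `sf_gradient`, `gradient_ascent_step`) and the parameters `mu` and `step_size` are hypothetical and do not come from the paper.

```python
import math
import random


def drm_estimate(returns, distortion):
    """Estimate a distortion risk measure from sampled cumulative rewards.

    Assumed L-statistic form: sum_i X_(i) * [g(1 - (i-1)/m) - g(1 - i/m)],
    where X_(1) <= ... <= X_(m) are the order statistics and g is the
    distortion function with g(0) = 0 and g(1) = 1.
    """
    xs = sorted(returns)  # order statistics of the sampled returns
    m = len(xs)
    total = 0.0
    for i, x in enumerate(xs, start=1):
        weight = distortion(1 - (i - 1) / m) - distortion(1 - i / m)
        total += x * weight
    return total


def random_unit_vector(dim, rng):
    """Draw a direction uniformly from the unit sphere in `dim` dimensions."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def sf_gradient(objective, theta, mu, rng):
    """Two-sided smoothed functional gradient estimate (illustrative variant).

    Perturbs theta by +/- mu along one random unit direction v and returns
    d * (J(theta + mu v) - J(theta - mu v)) / (2 mu) * v.
    """
    d = len(theta)
    v = random_unit_vector(d, rng)
    plus = objective([t + mu * c for t, c in zip(theta, v)])
    minus = objective([t - mu * c for t, c in zip(theta, v)])
    scale = d * (plus - minus) / (2.0 * mu)
    return [scale * c for c in v]


def gradient_ascent_step(objective, theta, step_size, mu, rng):
    """One ascent update: theta <- theta + step_size * estimated gradient."""
    grad = sf_gradient(objective, theta, mu, rng)
    return [t + step_size * g for t, g in zip(theta, grad)]
```

In a reinforcement learning setting, `objective(theta)` would simulate a batch of episodes under the policy with parameters `theta` and return `drm_estimate` of their cumulative discounted rewards. With the identity distortion g(u) = u, the estimate reduces to the sample mean, which recovers the usual risk-neutral objective.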
