在强化学习学习中审查平均和折扣奖励平均和折扣最佳最佳奖励标准 (Examining average and discounted reward optimality criteria in reinforcement learning) - 专知论文

会员服务 ·

0

优化器 · Learning · motivation · 衰减系数 · 强化学习 ·

2022 年 9 月 1 日

Examining average and discounted reward optimality criteria in reinforcement learning

翻译：在强化学习学习中审查平均和折扣奖励平均和折扣最佳最佳奖励标准

Vektor Dewanto,Marcus Gallagher

from arxiv, 23 pages, restructuring, adding more details

In reinforcement learning (RL), the goal is to obtain an optimal policy, for which the optimality criterion is fundamentally important. Two major optimality criteria are average and discounted rewards. While the latter is more popular, it is problematic to apply in environments without an inherent notion of discounting. This motivates us to revisit a) the progression of optimality criteria in dynamic programming, b) justification for and complication of an artificial discount factor, and c) benefits of directly maximizing the average reward criterion, which is discounting-free. Our contributions include a thorough examination of the relationship between average and discounted rewards, as well as a discussion of their pros and cons in RL. We emphasize that average-reward RL methods possess the ingredient and mechanism for applying a family of discounting-free optimality criteria (Veinott, 1969) to RL.

翻译：在强化学习(RL)中,目标是获得最佳政策,其最佳标准具有根本重要性。两个主要最佳标准是平均和折扣奖励。虽然后者比较普遍,但在没有固有的折扣概念的环境中适用却成问题。这促使我们重新审视:(a) 动态方案拟订中最佳标准的进展;(b) 人为贴现因素的理由和复杂性;(c) 直接尽量扩大平均奖励标准的好处,这是无折扣的。我们的贡献包括彻底审查平均和折扣奖励之间的关系,以及在RL中讨论其利弊。我们强调,平均奖励RL方法具有将无折扣最佳标准(Viinott,1969年)适用于RL的成分和机制。

0

相关内容

优化器

【医学图像处理中的因果性】52页ppt，Causality Matters in Medical Imaging

【医学图像处理中的因果性】52页ppt，Causality Matters in Medical Imaging

专知会员服务

60+阅读 · 2020年3月14日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Plenary Talk2

【ICIG2021】Latest News & Announcements of the Plenary Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年11月2日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

【ICIG2021】Latest News & Announcements of the Industry Talk1

【ICIG2021】Latest News & Announcements of the Industry Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年7月28日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

MARVELD1基因调控肝细胞癌介入治疗的机制研究

国家自然科学基金

0+阅读 · 2016年12月31日

金属/石墨烯纳米天线探针SERS与SEIRA光谱双增强机理及关键技术研究

国家自然科学基金

0+阅读 · 2015年12月31日

中国田鼠亚科 Microtini族(Rodentia: Cricetidae: Arvicolinae)的分类与系统发育研究

国家自然科学基金

0+阅读 · 2014年12月31日

长链非编码RNA CAR intergenic 10在细胞衰老中的作用和机制

国家自然科学基金

1+阅读 · 2013年12月31日

基于SURE/PURE准则的图像盲反卷积算法研究

国家自然科学基金

3+阅读 · 2013年12月31日

拟线性退化抛物-双曲型方程(组)的初边值问题

国家自然科学基金

0+阅读 · 2013年12月31日

Ghrelin抑制糖脂毒性诱导的心肌胰岛素抵抗的分子机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

退化k-Hessian方程解的正则性研究

国家自然科学基金

0+阅读 · 2011年12月31日

帕金森病患者成纤维细胞线粒体功能异常的机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

脂肪因子Chemerin在骨骼肌胰岛素抵抗发生中的作用及其机制

国家自然科学基金

0+阅读 · 2008年12月31日

Average-Reward Off-Policy Policy Evaluation with Function Approximation

Arxiv

0+阅读 · 2022年10月18日

Deep Black-Box Reinforcement Learning with Movement Primitives

Arxiv

0+阅读 · 2022年10月18日

Data-Efficient Pipeline for Offline Reinforcement Learning with Limited Data

Arxiv

0+阅读 · 2022年10月16日

Geometric Reinforcement Learning: The Case of Cartesian Space Orientation

Arxiv

0+阅读 · 2022年10月14日

Interpretable and Effective Reinforcement Learning for Attacking against Graph-based Rumor Detection

Arxiv

0+阅读 · 2022年10月14日

Adaptable Claim Rewriting with Offline Reinforcement Learning for Effective Misinformation Discovery

Arxiv

0+阅读 · 2022年10月14日

Recent Advances in Reinforcement Learning in Finance

Arxiv

11+阅读 · 2021年12月8日

A Survey on Reinforcement Learning for Recommender Systems

Arxiv

22+阅读 · 2021年9月22日

A Survey of Machine Learning for Computer Architecture and Systems

Arxiv

18+阅读 · 2021年2月16日

Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey

Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey

Arxiv

20+阅读 · 2020年3月10日

VIP会员

文章信息

相关主题

相关VIP内容

【医学图像处理中的因果性】52页ppt，Causality Matters in Medical Imaging

【医学图像处理中的因果性】52页ppt，Causality Matters in Medical Imaging

专知会员服务

60+阅读 · 2020年3月14日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

面向具身智能的多模态数据存储与检索：综述

《算法战争研究计划全景评估》35页

【CMU博士论文】水下三维视觉感知与生成

智能体战争：自主人工智能军备竞赛全景透视

相关资讯

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Plenary Talk2

【ICIG2021】Latest News & Announcements of the Plenary Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年11月2日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

【ICIG2021】Latest News & Announcements of the Industry Talk1

【ICIG2021】Latest News & Announcements of the Industry Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年7月28日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

相关论文

Average-Reward Off-Policy Policy Evaluation with Function Approximation

Arxiv

0+阅读 · 2022年10月18日

Deep Black-Box Reinforcement Learning with Movement Primitives

Arxiv

0+阅读 · 2022年10月18日

Data-Efficient Pipeline for Offline Reinforcement Learning with Limited Data

Arxiv

0+阅读 · 2022年10月16日

Geometric Reinforcement Learning: The Case of Cartesian Space Orientation

Arxiv

0+阅读 · 2022年10月14日

Interpretable and Effective Reinforcement Learning for Attacking against Graph-based Rumor Detection

Arxiv

0+阅读 · 2022年10月14日

Adaptable Claim Rewriting with Offline Reinforcement Learning for Effective Misinformation Discovery

Arxiv

0+阅读 · 2022年10月14日

Recent Advances in Reinforcement Learning in Finance

Arxiv

11+阅读 · 2021年12月8日

A Survey on Reinforcement Learning for Recommender Systems

Arxiv

22+阅读 · 2021年9月22日

A Survey of Machine Learning for Computer Architecture and Systems

Arxiv

18+阅读 · 2021年2月16日

Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey

Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey

Arxiv

20+阅读 · 2020年3月10日

相关基金

MARVELD1基因调控肝细胞癌介入治疗的机制研究

国家自然科学基金

0+阅读 · 2016年12月31日

金属/石墨烯纳米天线探针SERS与SEIRA光谱双增强机理及关键技术研究

国家自然科学基金

0+阅读 · 2015年12月31日

中国田鼠亚科 Microtini族(Rodentia: Cricetidae: Arvicolinae)的分类与系统发育研究

国家自然科学基金

0+阅读 · 2014年12月31日

长链非编码RNA CAR intergenic 10在细胞衰老中的作用和机制

国家自然科学基金

1+阅读 · 2013年12月31日

基于SURE/PURE准则的图像盲反卷积算法研究

国家自然科学基金

3+阅读 · 2013年12月31日

拟线性退化抛物-双曲型方程(组)的初边值问题

国家自然科学基金

0+阅读 · 2013年12月31日

Ghrelin抑制糖脂毒性诱导的心肌胰岛素抵抗的分子机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

退化k-Hessian方程解的正则性研究

国家自然科学基金

0+阅读 · 2011年12月31日

帕金森病患者成纤维细胞线粒体功能异常的机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

脂肪因子Chemerin在骨骼肌胰岛素抵抗发生中的作用及其机制

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员