Strategy Complexity of Mean Payoff, Total Payoff and Point Payoff Objectives in Countable MDPs - 专知 Papers

July 10, 2021

Strategy Complexity of Mean Payoff, Total Payoff and Point Payoff Objectives in Countable MDPs

Richard Mayr, Eric Munday

From arXiv. Full version of a conference paper at CONCUR 2021. 41 pages.

We study countably infinite Markov decision processes (MDPs) with real-valued transition rewards. Every infinite run induces the following sequences of payoffs: 1. Point payoff (the sequence of directly seen transition rewards), 2. Total payoff (the sequence of the sums of all rewards so far), and 3. Mean payoff. For each payoff type, the objective is to maximize the probability that the $\liminf$ is non-negative. We establish the complete picture of the strategy complexity of these objectives, i.e., how much memory is necessary and sufficient for $\varepsilon$-optimal (resp. optimal) strategies. Some cases can be won with memoryless deterministic strategies, while others require a step counter, a reward counter, or both.
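The three payoff sequences named in the abstract can be illustrated on a finite prefix of a run. The sketch below is only an illustration, not the paper's construction. The function names `payoff_sequences` and `tail_min` are hypothetical. The mean payoff is assumed to be the running average (sum of the first n rewards divided by n), which the abstract does not spell out. The $\liminf$ is a property of the infinite run, so the tail minimum used here is only a finite stand-in.

```python
from itertools import accumulate
from typing import List, Sequence, Tuple


def payoff_sequences(
    rewards: Sequence[float],
) -> Tuple[List[float], List[float], List[float]]:
    """Return the point, total and mean payoff sequences of a finite reward prefix."""
    point = list(rewards)            # rewards as seen, one per transition
    total = list(accumulate(point))  # sum of all rewards so far
    # Assumption: mean payoff after n steps is the running average total_n / n.
    mean = [s / n for n, s in enumerate(total, start=1)]
    return point, total, mean


def tail_min(seq: Sequence[float], start: int) -> float:
    """Minimum over seq[start:], a finite stand-in for the liminf (assumption)."""
    return min(seq[start:])


if __name__ == "__main__":
    # A run whose rewards alternate +1, -1 (truncated to 100 steps).
    rewards = [1, -1] * 50
    point, total, mean = payoff_sequences(rewards)
    half = len(rewards) // 2
    print("point:", tail_min(point, half))
    print("total:", tail_min(total, half))
    print("mean: ", tail_min(mean, half))
```

On the infinite alternating run, the point payoff has $\liminf$ equal to -1, while the total payoff (1, 0, 1, 0, ...) and the mean payoff both have $\liminf$ equal to 0. The same run therefore fails the point payoff objective but satisfies the other two, which is one way to see that the three objectives are genuinely different.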
