粉尘游戏中的阿雷亚独立性有限计量确定力 (Arena-Independent Finite-Memory Determinacy in Stochastic Games) - 专知论文

会员服务 ·

0

优化器 · 可约的 · CONCUR · 可理解性 · TOOLS ·

2021 年 2 月 19 日

Arena-Independent Finite-Memory Determinacy in Stochastic Games

翻译：粉尘游戏中的阿雷亚独立性有限计量确定力

Patricia Bouyer,Youssouf Oualhadj,Mickael Randour,Pierre Vandenhove

from arxiv, 35 pages, 3 figures

We study stochastic zero-sum games on graphs, which are prevalent tools to model decision-making in presence of an antagonistic opponent in a random environment. In this setting, an important question is the one of strategy complexity: what kinds of strategies are sufficient or required to play optimally (e.g., randomization or memory requirements)? Our contributions further the understanding of arena-independent finite-memory (AIFM) determinacy, i.e., the study of objectives for which memory is needed, but in a way that only depends on limited parameters of the game graphs. First, we show that objectives for which pure AIFM strategies suffice to play optimally also admit pure AIFM subgame perfect strategies. Second, we show that we can reduce the study of objectives for which pure AIFM strategies suffice in two-player stochastic games to the easier study of one-player stochastic games (i.e., Markov decision processes). Third, we characterize the sufficiency of AIFM strategies through two intuitive properties of objectives. This work extends a line of research started on deterministic games in [BLO+20] to stochastic ones. [BLO+20] Patricia Bouyer, St\'ephane Le Roux, Youssouf Oualhadj, Mickael Randour, and Pierre Vandenhove. Games Where You Can Play Optimally with Arena-Independent Finite Memory. CONCUR 2020.

翻译：我们研究图表上的零和游戏,这是在随机环境中当着对立对手进行模拟决策的常用工具。在这个背景下,一个重要的问题是战略复杂性:什么样的战略足够或需要最优化地发挥(如随机化或记忆要求)?我们的贡献进一步增进了对视场独立有限游戏(AIFM)确定性的理解,即对需要记忆的目标的研究,但只能依赖游戏图表的有限参数。首先,我们展示了纯的AIFM战略足以最理想地接受纯的AIFM亚游戏完美战略的目标:哪些战略足以或需要最优化地发挥(如随机化或记忆要求)?我们的贡献进一步增进了对单玩游戏(即Markov决定程序)的轻松研究。我们通过目标的两个直观性属性来描述AIFM战略的充足性。本项工作把关于确定性游戏的系列研究范围从PARIC+OVA、OFOA、ROFO和ROFOFO。

0

相关内容

优化器

ICLR 2021杰出论文奖出炉，8篇论文上榜！

专知会员服务

26+阅读 · 2021年4月2日

【干货书】强化学习算法，98页pdf综合讲解人工智能和机器学习

【干货书】强化学习算法，98页pdf综合讲解人工智能和机器学习

专知会员服务

66+阅读 · 2021年2月21日

【干货书】机器学习速查手册，135页pdf

【干货书】机器学习速查手册，135页pdf

专知会员服务

127+阅读 · 2020年11月20日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

「强化学习之路」清华博士后解读83篇文献，万字长文总结

「强化学习之路」清华博士后解读83篇文献，万字长文总结

专知会员服务

67+阅读 · 2020年2月28日

UC.Berkeley CS189讲义教材:《机器学习全面指南》，185页pdf

专知会员服务

162+阅读 · 2020年1月16日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

已删除

将门创投

4+阅读 · 2019年6月5日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

分布式TensorFlow入门指南

分布式TensorFlow入门指南

机器学习研究会

4+阅读 · 2017年11月28日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

【推荐】RNN/LSTM时序预测

【推荐】RNN/LSTM时序预测

机器学习研究会

25+阅读 · 2017年9月8日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

强化学习 cartpole_a3c

强化学习 cartpole_a3c

CreateAMind

9+阅读 · 2017年7月21日

An Efficient Pessimistic-Optimistic Algorithm for Stochastic Linear Bandits with General Constraints

Arxiv

0+阅读 · 2021年4月13日

An Efficient Algorithm for Deep Stochastic Contextual Bandits

An Efficient Algorithm for Deep Stochastic Contextual Bandits

Arxiv

0+阅读 · 2021年4月12日

Personalized Bundle Recommendation in Online Games

Arxiv

0+阅读 · 2021年4月12日

The Theory of Universal Graphs for Infinite Duration Games

Arxiv

0+阅读 · 2021年4月12日

Mean-field Approximation for Stochastic Population Processes in Networks under Imperfect Information

Arxiv

0+阅读 · 2021年4月11日

Categorical Stochastic Processes and Likelihood

Arxiv

0+阅读 · 2021年4月8日

A Modern Introduction to Online Learning

A Modern Introduction to Online Learning

Arxiv

21+阅读 · 2019年12月31日

Meta-Learning with Implicit Gradients

Meta-Learning with Implicit Gradients

Arxiv

13+阅读 · 2019年9月10日

Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks

Arxiv

8+阅读 · 2018年11月21日

Large-Scale Stochastic Sampling from the Probability Simplex

Arxiv

3+阅读 · 2018年6月19日

VIP会员

文章信息

相关主题

相关VIP内容

ICLR 2021杰出论文奖出炉，8篇论文上榜！

专知会员服务

26+阅读 · 2021年4月2日

【干货书】强化学习算法，98页pdf综合讲解人工智能和机器学习

【干货书】强化学习算法，98页pdf综合讲解人工智能和机器学习

专知会员服务

66+阅读 · 2021年2月21日

【干货书】机器学习速查手册，135页pdf

【干货书】机器学习速查手册，135页pdf

专知会员服务

127+阅读 · 2020年11月20日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

「强化学习之路」清华博士后解读83篇文献，万字长文总结

「强化学习之路」清华博士后解读83篇文献，万字长文总结

专知会员服务

67+阅读 · 2020年2月28日

UC.Berkeley CS189讲义教材:《机器学习全面指南》，185页pdf

专知会员服务

162+阅读 · 2020年1月16日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

Deep Research（深度研究）：系统性综述

《革新战术战场空间能力：反无人机系统》报告

【普林斯顿博士论文】用于语音的生成式通用模型

螺旋式开发作为战略资产：美军启示

相关资讯

已删除

将门创投

4+阅读 · 2019年6月5日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

分布式TensorFlow入门指南

分布式TensorFlow入门指南

机器学习研究会

4+阅读 · 2017年11月28日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

【推荐】RNN/LSTM时序预测

【推荐】RNN/LSTM时序预测

机器学习研究会

25+阅读 · 2017年9月8日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

强化学习 cartpole_a3c

强化学习 cartpole_a3c

CreateAMind

9+阅读 · 2017年7月21日

相关论文

An Efficient Pessimistic-Optimistic Algorithm for Stochastic Linear Bandits with General Constraints

Arxiv

0+阅读 · 2021年4月13日

An Efficient Algorithm for Deep Stochastic Contextual Bandits

An Efficient Algorithm for Deep Stochastic Contextual Bandits

Arxiv

0+阅读 · 2021年4月12日

Personalized Bundle Recommendation in Online Games

Arxiv

0+阅读 · 2021年4月12日

The Theory of Universal Graphs for Infinite Duration Games

Arxiv

0+阅读 · 2021年4月12日

Mean-field Approximation for Stochastic Population Processes in Networks under Imperfect Information

Arxiv

0+阅读 · 2021年4月11日

Categorical Stochastic Processes and Likelihood

Arxiv

0+阅读 · 2021年4月8日

A Modern Introduction to Online Learning

A Modern Introduction to Online Learning

Arxiv

21+阅读 · 2019年12月31日

Meta-Learning with Implicit Gradients

Meta-Learning with Implicit Gradients

Arxiv

13+阅读 · 2019年9月10日

Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks

Arxiv

8+阅读 · 2018年11月21日

Large-Scale Stochastic Sampling from the Probability Simplex

Arxiv

3+阅读 · 2018年6月19日

微信扫码咨询专知VIP会员