Fictitious Play in Markov Games with Single Controller - 专知 (Zhuanzhi) Papers


May 23, 2022

Fictitious Play in Markov Games with Single Controller


Muhammed O. Sayin,Kaiqing Zhang,Asuman Ozdaglar

From arXiv; accepted to the ACM Conference on Economics and Computation (EC) 2022

Certain but important classes of strategic-form games, including zero-sum and identical-interest games, have the fictitious-play-property (FPP), i.e., beliefs formed in fictitious play dynamics always converge to a Nash equilibrium (NE) in the repeated play of these games. Such convergence results are seen as a (behavioral) justification for the game-theoretical equilibrium analysis. Markov games (MGs), also known as stochastic games, generalize the repeated play of strategic-form games to dynamic multi-state settings with Markovian state transitions. In particular, MGs are standard models for multi-agent reinforcement learning -- a reviving research area in learning and games, and their game-theoretical equilibrium analyses have also been conducted extensively. However, whether certain classes of MGs have the FPP or not (i.e., whether there is a behavioral justification for equilibrium analysis or not) remains largely elusive. In this paper, we study a new variant of fictitious play dynamics for MGs and show its convergence to an NE in n-player identical-interest MGs in which a single player controls the state transitions. Such games are of interest in communications, control, and economics applications. Our result together with the recent results in [Sayin et al. 2020] establishes the FPP of two-player zero-sum MGs and n-player identical-interest MGs with a single controller (standing at two different ends of the MG spectrum from fully competitive to fully cooperative).
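To make the fictitious-play-property concrete, the following is a minimal sketch of classical fictitious play in a two-player identical-interest strategic-form game. It is not the new variant for Markov games studied in the paper; the payoff matrix, the uniform prior counts, and the function name `fictitious_play` are assumptions chosen only for illustration. Each player best-responds to the empirical frequency of the other player's past actions, and the beliefs settle on a pure Nash equilibrium of this small example.

```python
import numpy as np


def fictitious_play(payoff, steps=200):
    """Classical fictitious play in a two-player identical-interest game.

    payoff[a, b] is the common payoff when player 1 plays a and player 2 plays b.
    Returns the empirical action frequencies (beliefs) about each player.
    """
    n_a, n_b = payoff.shape
    # Assumption: start from a uniform prior (one fictitious observation per action).
    counts_1 = np.ones(n_a)
    counts_2 = np.ones(n_b)
    for _ in range(steps):
        belief_about_1 = counts_1 / counts_1.sum()
        belief_about_2 = counts_2 / counts_2.sum()
        # Each player best-responds to its belief about the other player.
        a = int(np.argmax(payoff @ belief_about_2))
        b = int(np.argmax(belief_about_1 @ payoff))
        counts_1[a] += 1
        counts_2[b] += 1
    return counts_1 / counts_1.sum(), counts_2 / counts_2.sum()


# Hypothetical 2x2 common-payoff matrix with pure equilibria at (0, 0) and (1, 1).
payoff = np.array([[2.0, 0.0],
                   [0.0, 1.0]])
belief_1, belief_2 = fictitious_play(payoff)
print(belief_1, belief_2)
```

In this example both beliefs concentrate on action 0, the payoff-maximizing equilibrium. A Markov-game version would additionally need state-dependent beliefs and continuation-value estimates, which this sketch deliberately omits.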

