On the Complexity of Computing Markov Perfect Equilibrium in General-Sum Stochastic Games - Zhuanzhi Papers

September 4, 2021

Xiaotie Deng,Yuhao Li,David Henry Mguni,Jun Wang,Yaodong Yang

Similar to the role of Markov decision processes in reinforcement learning, Stochastic Games (SGs) lay the foundation for the study of multi-agent reinforcement learning (MARL) and sequential agent interactions. In this paper, we derive that computing an approximate Markov Perfect Equilibrium (MPE) in a finite-state discounted Stochastic Game within exponential precision is PPAD-complete. We adopt a function with a polynomially bounded description in the strategy space to convert the MPE computation to a fixed-point problem, even though the stochastic game may demand, for each agent, a number of pure strategies that is exponential in the number of states. The completeness result follows from the reduction of the fixed-point problem to End of the Line. Our results indicate that finding an MPE in SGs is highly unlikely to be NP-hard unless NP = co-NP. Our work offers confidence for MARL research to study MPE computation on general-sum SGs and to develop fruitful algorithms, as is currently done for zero-sum SGs.
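As a rough illustration of what an approximate MPE means, the sketch below checks one common definition on a toy two-player discounted stochastic game: a pair of stationary (Markov) policies is an ε-MPE if, in every state, no player can gain more than ε by unilaterally switching to a best response. Everything in it is an assumption made for illustration and is not taken from the paper: the two-state game, the prisoner's-dilemma payoffs, the action-independent transitions, the discount factor, and all function names.

```python
GAMMA = 0.9          # discount factor (assumed)
STATES = [0, 1]
ACTIONS = [0, 1]     # 0 = cooperate, 1 = defect

# Hypothetical stage payoffs (r1, r2): a prisoner's dilemma, scaled by state.
PD = {(0, 0): (3, 3), (0, 1): (0, 4), (1, 0): (4, 0), (1, 1): (1, 1)}


def reward(s, joint):
    """Rewards of both players for a joint action in state s."""
    return tuple((1 + s) * r for r in PD[joint])


def transition(s, joint):
    """Next-state distribution; uniform and action-independent in this toy game."""
    return {0: 0.5, 1: 0.5}


def q_value(i, s, a_i, policy, V):
    """Expected return of player i for action a_i in state s,
    when the opponent follows its stationary policy and V gives continuation values."""
    j = 1 - i
    total = 0.0
    for a_j in ACTIONS:
        joint = (a_i, a_j) if i == 0 else (a_j, a_i)
        cont = sum(p * V[s2] for s2, p in transition(s, joint).items())
        total += policy[j][s][a_j] * (reward(s, joint)[i] + GAMMA * cont)
    return total


def policy_value(i, policy, iters=500):
    """Value of player i under the joint policy (iterative policy evaluation)."""
    V = [0.0 for _ in STATES]
    for _ in range(iters):
        V = [sum(policy[i][s][a] * q_value(i, s, a, policy, V) for a in ACTIONS)
             for s in STATES]
    return V


def best_response_value(i, policy, iters=500):
    """Optimal value of player i against the opponent's fixed policy (value iteration)."""
    V = [0.0 for _ in STATES]
    for _ in range(iters):
        V = [max(q_value(i, s, a, policy, V) for a in ACTIONS) for s in STATES]
    return V


def mpe_gap(policy):
    """Largest gain any player can obtain in any state by deviating unilaterally.
    The policy is an epsilon-MPE (in the sense above) iff this gap is <= epsilon."""
    gap = 0.0
    for i in (0, 1):
        V = policy_value(i, policy)
        V_br = best_response_value(i, policy)
        gap = max(gap, max(b - v for b, v in zip(V_br, V)))
    return gap


# policy[player][state] is a probability vector over ACTIONS.
always_defect = [[[0.0, 1.0] for _ in STATES] for _ in range(2)]
always_cooperate = [[[1.0, 0.0] for _ in STATES] for _ in range(2)]

print("gap (always defect):   ", mpe_gap(always_defect))
print("gap (always cooperate):", mpe_gap(always_cooperate))
```

In this toy game defecting is dominant in every state, so the always-defect profile has a gap of zero, while the always-cooperate profile has a gap of about 15.5. The sketch also hints at why a compact representation matters: a stationary behavioral policy is described by one probability vector per state, whereas the number of deterministic stationary policies per player grows as the number of actions raised to the number of states.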


