可观察完美均衡 (Observable Perfect Equilibrium) - 专知论文

会员服务 ·

0

纳什均衡 · 稳健性 · 不完美信息 · Agent · MoDELS ·

2023 年 3 月 17 日

Observable Perfect Equilibrium

翻译：可观察完美均衡

While Nash equilibrium has emerged as the central game-theoretic solution concept, many important games contain several Nash equilibria and we must determine how to select between them in order to create real strategic agents. Several Nash equilibrium refinement concepts have been proposed and studied for sequential imperfect-information games, the most prominent being trembling-hand perfect equilibrium, quasi-perfect equilibrium, and recently one-sided quasi-perfect equilibrium. These concepts are robust to certain arbitrarily small mistakes, and are guaranteed to always exist; however, we argue that neither of these is the correct concept for developing strong agents in sequential games of imperfect information. We define a new equilibrium refinement concept for extensive-form games called observable perfect equilibrium in which the solution is robust over trembles in publicly-observable action probabilities (not necessarily over all action probabilities that may not be observable by opposing players). Observable perfect equilibrium correctly captures the assumption that the opponent is playing as rationally as possible given mistakes that have been observed (while previous solution concepts do not). We prove that observable perfect equilibrium is always guaranteed to exist, and demonstrate that it leads to a different solution than the prior extensive-form refinements in no-limit poker. We expect observable perfect equilibrium to be a useful equilibrium refinement concept for modeling many important imperfect-information games of interest in artificial intelligence.

翻译：尽管纳什均衡已成为中心博弈论解概念，但许多重要游戏包含多个纳什均衡，我们必须确定如何在它们之间进行选择，以创建真正的战略代理。已经提出和研究了几个纳什均衡精化概念，其中最重要的是颤抖手完美均衡、拟完美均衡和最近的单侧拟完美均衡。这些概念对某些任意小的错误是健壮的，并且保证始终存在；然而，我们认为这些概念都不是在具有连续不完美信息的顺序游戏中开发强大代理的正确概念。我们为广泛形式的游戏定义了一个新的均衡精化概念，称为可观察完美均衡，在这种解中，解决方案对于在公开可观察的行动概率中的颤抖是健壮的（不一定对于所有对手不能观察到的行动概率是健壮的）。可观察完美均衡正确地捕捉了对手在观察到的错误下尽可能理性地玩游戏的假设（而之前的解概念没有）。我们证明了可观察完美均衡总是保证存在，并且证明了它导致一个不同于以前无限扑克游戏的广泛形式精化的解。我们期望可观察完美均衡成为建模人工智能中许多重要不完美信息游戏的有用均衡精化概念。

0

相关内容

纳什均衡

【CMU博士论文】现代深度学习的均衡(Equilibrium)方法，155页pdf

【CMU博士论文】现代深度学习的均衡(Equilibrium)方法，155页pdf

专知会员服务

37+阅读 · 2022年6月16日

牛津大学《多智能体影响图的均衡优化: 理论和实践》，Equilibrium Refinements for Multi-Agent Influence Diagrams: Theory and Practice

牛津大学《多智能体影响图的均衡优化: 理论和实践》，Equilibrium Refinements for Multi-Agent Influence Diagrams: Theory and Practice

专知会员服务

26+阅读 · 2022年4月10日

DARPA SI3-CMD项目支持，《网络多智能体影响博弈中的可扩展均衡计算》麻省理工、马里兰大学，Scalable Equilibrium Computation in Multi-agent Influence Games on Networks

DARPA SI3-CMD项目支持，《网络多智能体影响博弈中的可扩展均衡计算》麻省理工、马里兰大学，Scalable Equilibrium Computation in Multi-agent Influence Games on Networks

专知会员服务

24+阅读 · 2022年4月10日

【ICML2021】学习权衡不完美的示范

专知会员服务

15+阅读 · 2021年9月23日

【论文推荐】逆问题，深度学习，对称性破缺，Inverse Problems, Deep Learning, and Symmetry Breaking

【论文推荐】逆问题，深度学习，对称性破缺，Inverse Problems, Deep Learning, and Symmetry Breaking

专知会员服务

26+阅读 · 2020年3月27日

【牛津大学ICLR2020】通过元学习的贝叶斯自适应深度RL, VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning

【牛津大学ICLR2020】通过元学习的贝叶斯自适应深度RL, VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning

专知会员服务

25+阅读 · 2020年2月28日

【Facebook|AAAI2020】在合作的部分可观察博弈中通过搜索改进策略（Improving Policies via Search in Cooperative Partially Observable Games）

【Facebook|AAAI2020】在合作的部分可观察博弈中通过搜索改进策略（Improving Policies via Search in Cooperative Partially Observable Games）

专知会员服务

16+阅读 · 2019年12月10日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

181+阅读 · 2019年10月11日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【AAMSA 2019 | tutorial】多智能体系统中的认知推理Epistemic Reasoning In Multiagent Systems ,法国雷恩François Schwarzentruber

【AAMSA 2019 | tutorial】多智能体系统中的认知推理Epistemic Reasoning In Multiagent Systems ,法国雷恩François Schwarzentruber

专知会员服务

24+阅读 · 2019年5月14日

Scrum 、POC和低质量软件的解决方案

Scrum 、POC和低质量软件的解决方案

InfoQ

0+阅读 · 2022年9月20日

灾难性遗忘问题新视角：迁移-干扰平衡

灾难性遗忘问题新视角：迁移-干扰平衡

CreateAMind

17+阅读 · 2019年7月6日

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

vae 相关论文表示学习 1

vae 相关论文表示学习 1

CreateAMind

12+阅读 · 2018年9月6日

【论文推荐】最新七篇知识图谱相关论文—知识表示学习、增强神经网络、链接预测、关系预测与提取、综述、递归特性生成、深度知识感知网络

【论文推荐】最新七篇知识图谱相关论文—知识表示学习、增强神经网络、链接预测、关系预测与提取、综述、递归特性生成、深度知识感知网络

专知

29+阅读 · 2018年3月6日

复杂生产制造环境下的排序问题研究

国家自然科学基金

0+阅读 · 2014年12月31日

脉冲激光delta-掺杂 III-V 族 AlNx（x=3，4）团簇制备高空穴浓度p-型氧化锌的研究

国家自然科学基金

0+阅读 · 2014年12月31日

全空间中临界Surface Quasi-geostrophic方程的全局吸引子及其分形维数

国家自然科学基金

0+阅读 · 2014年12月31日

Lp-Minkowski 问题及相关的 Monge-Ampere 型方程

国家自然科学基金

0+阅读 · 2013年12月31日

Nrf2-Keap1-ARE信号通路在脊髓损伤后胶质瘢痕形成中的作用及机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

关于AI-半环簇与 Conway半环簇的研究

国家自然科学基金

1+阅读 · 2012年12月31日

Wnt-Notch和Wnt-ERBB信号通路调控NSCLC上皮间质转化和耐药的机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

轴对称的Navier-Stokes方程

国家自然科学基金

1+阅读 · 2011年12月31日

共形曲面的谱簇的渐近分析

国家自然科学基金

0+阅读 · 2011年12月31日

排队模型与可靠性模型的时间依赖解的结构研究

国家自然科学基金

0+阅读 · 2008年12月31日

Computing Bayes Nash Equilibrium Strategies in Auction Games via Simultaneous Online Dual Averaging

Arxiv

0+阅读 · 2023年5月9日

GNNs,You can be Stronger,Deeper and Faster

Arxiv

1+阅读 · 2023年5月9日

PED-ANOVA: Efficiently Quantifying Hyperparameter Importance in Arbitrary Subspaces

Arxiv

0+阅读 · 2023年5月8日

Repeated Principal-Agent Games with Unobserved Agent Rewards and Perfect-Knowledge Agents

Arxiv

0+阅读 · 2023年5月7日

Signaling Games in Multiple Dimensions: Geometric Properties of Equilibrium Solutions

Arxiv

0+阅读 · 2023年5月7日

Root-n consistent semiparametric learning with high-dimensional nuisance functions under minimal sparsity

Arxiv

0+阅读 · 2023年5月7日

No-Regret Constrained Bayesian Optimization of Noisy and Expensive Hybrid Models using Differentiable Quantile Function Approximations

Arxiv

0+阅读 · 2023年5月5日

Phase Transitions of the Price-of-Anarchy Function in Multi-Commodity Routing Games

Arxiv

0+阅读 · 2023年5月5日

A Review and Roadmap of Deep Learning Causal Discovery in Different Variable Paradigms

Arxiv

22+阅读 · 2022年9月14日

Equilibrium Refinements for Multi-Agent Influence Diagrams: Theory and Practice

Equilibrium Refinements for Multi-Agent Influence Diagrams: Theory and Practice

Arxiv

15+阅读 · 2021年2月9日

VIP会员

文章信息

相关主题

不完美信息

相关VIP内容

【CMU博士论文】现代深度学习的均衡(Equilibrium)方法，155页pdf

【CMU博士论文】现代深度学习的均衡(Equilibrium)方法，155页pdf

专知会员服务

37+阅读 · 2022年6月16日

牛津大学《多智能体影响图的均衡优化: 理论和实践》，Equilibrium Refinements for Multi-Agent Influence Diagrams: Theory and Practice

牛津大学《多智能体影响图的均衡优化: 理论和实践》，Equilibrium Refinements for Multi-Agent Influence Diagrams: Theory and Practice

专知会员服务

26+阅读 · 2022年4月10日

DARPA SI3-CMD项目支持，《网络多智能体影响博弈中的可扩展均衡计算》麻省理工、马里兰大学，Scalable Equilibrium Computation in Multi-agent Influence Games on Networks

DARPA SI3-CMD项目支持，《网络多智能体影响博弈中的可扩展均衡计算》麻省理工、马里兰大学，Scalable Equilibrium Computation in Multi-agent Influence Games on Networks

专知会员服务

24+阅读 · 2022年4月10日

【ICML2021】学习权衡不完美的示范

专知会员服务

15+阅读 · 2021年9月23日

【论文推荐】逆问题，深度学习，对称性破缺，Inverse Problems, Deep Learning, and Symmetry Breaking

【论文推荐】逆问题，深度学习，对称性破缺，Inverse Problems, Deep Learning, and Symmetry Breaking

专知会员服务

26+阅读 · 2020年3月27日

【牛津大学ICLR2020】通过元学习的贝叶斯自适应深度RL, VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning

【牛津大学ICLR2020】通过元学习的贝叶斯自适应深度RL, VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning

专知会员服务

25+阅读 · 2020年2月28日

【Facebook|AAAI2020】在合作的部分可观察博弈中通过搜索改进策略（Improving Policies via Search in Cooperative Partially Observable Games）

【Facebook|AAAI2020】在合作的部分可观察博弈中通过搜索改进策略（Improving Policies via Search in Cooperative Partially Observable Games）

专知会员服务

16+阅读 · 2019年12月10日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

181+阅读 · 2019年10月11日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【AAMSA 2019 | tutorial】多智能体系统中的认知推理Epistemic Reasoning In Multiagent Systems ,法国雷恩François Schwarzentruber

【AAMSA 2019 | tutorial】多智能体系统中的认知推理Epistemic Reasoning In Multiagent Systems ,法国雷恩François Schwarzentruber

专知会员服务

24+阅读 · 2019年5月14日

热门VIP内容

开通专知VIP会员享更多权益服务

《巡飞弹药（爆炸性无人机）威胁态势分析》最新24页报告

《军用后勤无人机：破解战场运输挑战的创新方案》

人工智能战争：以色列、伊朗与新型AI战争形态

《俄乌战争：现代战争未来的启示与经验》

相关资讯

Scrum 、POC和低质量软件的解决方案

Scrum 、POC和低质量软件的解决方案

InfoQ

0+阅读 · 2022年9月20日

灾难性遗忘问题新视角：迁移-干扰平衡

灾难性遗忘问题新视角：迁移-干扰平衡

CreateAMind

17+阅读 · 2019年7月6日

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

vae 相关论文表示学习 1

vae 相关论文表示学习 1

CreateAMind

12+阅读 · 2018年9月6日

【论文推荐】最新七篇知识图谱相关论文—知识表示学习、增强神经网络、链接预测、关系预测与提取、综述、递归特性生成、深度知识感知网络

【论文推荐】最新七篇知识图谱相关论文—知识表示学习、增强神经网络、链接预测、关系预测与提取、综述、递归特性生成、深度知识感知网络

专知

29+阅读 · 2018年3月6日

相关论文

Computing Bayes Nash Equilibrium Strategies in Auction Games via Simultaneous Online Dual Averaging

Arxiv

0+阅读 · 2023年5月9日

GNNs,You can be Stronger,Deeper and Faster

Arxiv

1+阅读 · 2023年5月9日

PED-ANOVA: Efficiently Quantifying Hyperparameter Importance in Arbitrary Subspaces

Arxiv

0+阅读 · 2023年5月8日

Repeated Principal-Agent Games with Unobserved Agent Rewards and Perfect-Knowledge Agents

Arxiv

0+阅读 · 2023年5月7日

Signaling Games in Multiple Dimensions: Geometric Properties of Equilibrium Solutions

Arxiv

0+阅读 · 2023年5月7日

Root-n consistent semiparametric learning with high-dimensional nuisance functions under minimal sparsity

Arxiv

0+阅读 · 2023年5月7日

No-Regret Constrained Bayesian Optimization of Noisy and Expensive Hybrid Models using Differentiable Quantile Function Approximations

Arxiv

0+阅读 · 2023年5月5日

Phase Transitions of the Price-of-Anarchy Function in Multi-Commodity Routing Games

Arxiv

0+阅读 · 2023年5月5日

A Review and Roadmap of Deep Learning Causal Discovery in Different Variable Paradigms

Arxiv

22+阅读 · 2022年9月14日

Equilibrium Refinements for Multi-Agent Influence Diagrams: Theory and Practice

Equilibrium Refinements for Multi-Agent Influence Diagrams: Theory and Practice

Arxiv

15+阅读 · 2021年2月9日

相关基金

复杂生产制造环境下的排序问题研究

国家自然科学基金

0+阅读 · 2014年12月31日

脉冲激光delta-掺杂 III-V 族 AlNx（x=3，4）团簇制备高空穴浓度p-型氧化锌的研究

国家自然科学基金

0+阅读 · 2014年12月31日

全空间中临界Surface Quasi-geostrophic方程的全局吸引子及其分形维数

国家自然科学基金

0+阅读 · 2014年12月31日

Lp-Minkowski 问题及相关的 Monge-Ampere 型方程

国家自然科学基金

0+阅读 · 2013年12月31日

Nrf2-Keap1-ARE信号通路在脊髓损伤后胶质瘢痕形成中的作用及机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

关于AI-半环簇与 Conway半环簇的研究

国家自然科学基金

1+阅读 · 2012年12月31日

Wnt-Notch和Wnt-ERBB信号通路调控NSCLC上皮间质转化和耐药的机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

轴对称的Navier-Stokes方程

国家自然科学基金

1+阅读 · 2011年12月31日

共形曲面的谱簇的渐近分析

国家自然科学基金

0+阅读 · 2011年12月31日

排队模型与可靠性模型的时间依赖解的结构研究

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员