Emergence of Locally Suboptimal Behavior in Finitely Repeated Games - 专知论文


Games · Mixed strategies · Equilibrium · Rounds · Mixed

March 29, 2023

Emergence of Locally Suboptimal Behavior in Finitely Repeated Games


Yichen Yang, Martin Rinard

We study the emergence of locally suboptimal behavior in finitely repeated games. Locally suboptimal behavior refers to players playing suboptimally in some rounds of the repeated game (i.e., not maximizing their payoffs in those rounds) while still maximizing their total payoffs over the whole repeated game. The central research question we aim to answer is when locally suboptimal behavior can arise from rational play in finitely repeated games. In this research, we focus on the emergence of locally suboptimal behavior in subgame-perfect equilibria (SPE) of finitely repeated games with complete information. We prove the first sufficient and necessary condition on the stage game G that ensures that, for all T and all subgame-perfect equilibria of the repeated game G(T), the strategy profile at every round of G(T) forms a Nash equilibrium of the stage game G. We prove the sufficient and necessary conditions for three cases: 1) only pure strategies are allowed, 2) the general case where mixed strategies are allowed, and 3) one player can only use pure strategies and the other player can use mixed strategies. Based on these results, we obtain complete characterizations of when allowing players to play mixed strategies changes whether local suboptimality can ever occur in some repeated game. Furthermore, we present an algorithm for the computational problem of deciding, given an arbitrary stage game, whether locally suboptimal behavior can arise in the corresponding finitely repeated games. This addresses the practical side of the research question.
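To make the notion of locally suboptimal play concrete, here is a minimal Python sketch. It is not the condition or the algorithm from the paper: it only covers the textbook two-round case G(2) with two players, pure strategies, no discounting, and pure stage Nash equilibria as continuations. The function names and the 3x3 example payoffs are assumptions made for illustration. The idea is that a first-round profile that is not a stage Nash equilibrium can still be part of an SPE if some stage equilibrium used as a reward, together with each player's worst stage equilibrium used as a punishment, outweighs that player's one-shot deviation gain.

```python
from itertools import product


def pure_nash_equilibria(u1, u2):
    """Return all pure Nash equilibria (row, col) of a two-player stage game.

    u1[r][c] and u2[r][c] are the row and column player's payoffs.
    """
    rows, cols = len(u1), len(u1[0])
    eqs = []
    for r, c in product(range(rows), range(cols)):
        row_best = all(u1[r][c] >= u1[r2][c] for r2 in range(rows))
        col_best = all(u2[r][c] >= u2[r][c2] for c2 in range(cols))
        if row_best and col_best:
            eqs.append((r, c))
    return eqs


def sustainable_non_nash_profiles(u1, u2):
    """Pure profiles that are not stage equilibria but can be played in
    round 1 of a pure-strategy SPE of the two-round game (illustrative check).

    A profile is sustainable if some stage equilibrium e, played in round 2
    as a reward, satisfies for each player:
        u_i(e) - (worst stage-equilibrium payoff of i) >= deviation gain of i.
    """
    rows, cols = len(u1), len(u1[0])
    eqs = pure_nash_equilibria(u1, u2)
    if not eqs:
        return []
    worst1 = min(u1[r][c] for r, c in eqs)  # harshest punishment for row player
    worst2 = min(u2[r][c] for r, c in eqs)  # harshest punishment for column player
    sustainable = []
    for r, c in product(range(rows), range(cols)):
        if (r, c) in eqs:
            continue
        gain1 = max(u1[r2][c] for r2 in range(rows)) - u1[r][c]
        gain2 = max(u2[r][c2] for c2 in range(cols)) - u2[r][c]
        if any(u1[er][ec] - worst1 >= gain1 and u2[er][ec] - worst2 >= gain2
               for er, ec in eqs):
            sustainable.append((r, c))
    return sustainable


# Hypothetical stage game: actions 0 = cooperate, 1 = defect, 2 = a third action.
U1 = [[4, 0, 0],
      [5, 1, 0],
      [0, 0, 3]]
U2 = [[4, 5, 0],
      [0, 1, 0],
      [0, 0, 3]]

print(pure_nash_equilibria(U1, U2))
print(sustainable_non_nash_profiles(U1, U2))
```

In this hypothetical game the two stage equilibria give different payoffs, which is what makes a credible second-round threat possible; in a prisoner's dilemma, which has a single stage equilibrium, the same check returns an empty list.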


