Memory Asymmetry: A Key to Convergence in Zero-Sum Games

Agent · Learning · Nash equilibrium · Memory capacity · Gradient ascent

May 23, 2023


Yuma Fujimoto, Kaito Ariu, Kenshi Abe

From arXiv: 11 pages and 5 figures (main), 4 pages and 1 figure (appendix)

This study presents a new convergence mechanism for learning in games. Learning in games considers how multiple agents maximize their own rewards through repeated play. In two-player zero-sum games in particular, where the agents compete for reward, each agent's reward depends on the opponent's strategy. A critical problem therefore emerges when both agents learn their strategies with standard algorithms such as replicator dynamics and gradient ascent: their learning dynamics often cycle and fail to converge to the optimal strategies, i.e., the Nash equilibrium. We tackle this problem from a novel perspective: asymmetry between the agents' learning algorithms. We consider with-memory games, in which the agents store previously played actions in memory and use them to choose their subsequent actions. In such games, we focus on the asymmetry in memory capacity between the agents. We demonstrate, both theoretically and experimentally, that the learning dynamics converge to the Nash equilibrium when the agents have different memory capacities. Moreover, we give an interpretation of this convergence: the agent with the longer memory can use a more complex strategy, which endows the other agent's utility with strict concavity.
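The non-convergence problem described in the abstract can be illustrated with a minimal sketch. It is not the authors' method and does not model memory at all: it only runs memoryless, simultaneous gradient ascent on Matching Pennies, a standard zero-sum game chosen here as an assumption. The payoff u(x, y) = (2x - 1)(2y - 1), the step size, the starting point, and the function names are all hypothetical choices made for illustration. With a fixed step size, this discrete-time version spirals away from the Nash equilibrium (0.5, 0.5) rather than tracing a closed cycle, but in either case it fails to converge.

```python
import math


def clip01(p):
    """Keep a probability inside [0, 1]."""
    return min(1.0, max(0.0, p))


def simulate_matching_pennies(x0=0.6, y0=0.5, step_size=0.05, num_steps=500):
    """Simultaneous projected gradient ascent on Matching Pennies.

    x and y are the probabilities that players X and Y play "heads".
    X receives u(x, y) = (2x - 1)(2y - 1); Y receives -u(x, y).
    Returns the distance to the Nash equilibrium (0.5, 0.5) at every step.
    """
    x, y = x0, y0
    distances = [math.hypot(x - 0.5, y - 0.5)]
    for _ in range(num_steps):
        grad_x = 2.0 * (2.0 * y - 1.0)    # derivative of u with respect to x
        grad_y = -2.0 * (2.0 * x - 1.0)   # derivative of -u with respect to y
        # Both players update at the same time from the current strategies.
        x, y = clip01(x + step_size * grad_x), clip01(y + step_size * grad_y)
        distances.append(math.hypot(x - 0.5, y - 0.5))
    return distances


if __name__ == "__main__":
    dist = simulate_matching_pennies()
    print(f"initial distance to Nash: {dist[0]:.3f}")
    print(f"final distance to Nash:   {dist[-1]:.3f}")
```

In this sketch the distance to the equilibrium never shrinks below its starting value, which is the failure mode that the memory-asymmetry mechanism in the abstract is meant to address.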
