团队工作使冯纽曼工作:Min-Max在两队零-苏姆运动会中实现最优化 (Teamwork makes von Neumann work: Min-Max Optimization in Two-Team Zero-Sum Games) - 专知论文

会员服务 ·

0

TEAM · 优化器 · 纳什均衡 · Networks · 鞍点 ·

2021 年 11 月 9 日

Teamwork makes von Neumann work: Min-Max Optimization in Two-Team Zero-Sum Games

翻译：团队工作使冯纽曼工作:Min-Max在两队零-苏姆运动会中实现最优化

Fivos Kalogiannis,Ioannis Panageas,Emmanouil-Vasileios Vlatakis-Gkaragkounis

Motivated by recent advances in both theoretical and applied aspects of multiplayer games, spanning from e-sports to multi-agent generative adversarial networks, we focus on min-max optimization in team zero-sum games. In this class of games, players are split into two teams with payoffs equal within the same team and of opposite sign across the opponent team. Unlike the textbook two-player zero-sum games, finding a Nash equilibrium in our class can be shown to be CLS-hard, i.e., it is unlikely to have a polynomial-time algorithm for computing Nash equilibria. Moreover, in this generalized framework, we establish that even asymptotic last iterate or time average convergence to a Nash Equilibrium is not possible using Gradient Descent Ascent (GDA), its optimistic variant, and extra gradient. Specifically, we present a family of team games whose induced utility is \emph{non} multi-linear with \emph{non} attractive \emph{per-se} mixed Nash Equilibria, as strict saddle points of the underlying optimization landscape. Leveraging techniques from control theory, we complement these negative results by designing a modified GDA that converges locally to Nash equilibria. Finally, we discuss connections of our framework with AI architectures with team competition structures like multi-agent generative adversarial networks.

翻译：在多玩游戏的理论和应用方面,从电子体育到多试剂基因对抗网络,最近出现了一些进步,从电子体育到多试剂基因对抗网络,我们注重在零和球队零和球队中实现微麦优化。在这一类游戏中,球员分成两个球队,球员在球队中报酬相等,对手队之间则有相反的标志。与教科书中的双球运动员零和球队不同,在我们的球队中找到一个纳什平衡,可以证明是CLS-硬的,也就是说,在计算纳什平衡时,我们不可能有一个从电子体育到多试调的多时空算法。此外,在这个普遍的框架内,我们确定即使最后的重复性或平均趋同纳什平衡是不可能使用同一球队中同等报酬的球队。具体地说,我们展示了一组球队游戏的组合,其诱发的效用是cemph{nonnon-nortial, 也就是说,它是一个具有吸引力的耐姆-per-eme-me 混合的纳什-equiliblibrial 网络,这是我们最基础的理论化结构结构的最后讨论。我们从一个严格的基调的基调结构结构,我们最后讨论。

0

相关内容

TEAM

ICLR 2021杰出论文奖出炉，8篇论文上榜！

专知会员服务

26+阅读 · 2021年4月2日

INRIA最新「机器学习理论」新书，229页pdf原理性阐述机器学习

INRIA最新「机器学习理论」新书，229页pdf原理性阐述机器学习

专知会员服务

69+阅读 · 2021年3月27日

【Google】平滑对抗训练，Smooth Adversarial Training

【Google】平滑对抗训练，Smooth Adversarial Training

专知会员服务

49+阅读 · 2020年7月4日

【牛津大学】深度残差强化学习，Deep Residual Reinforcement Learning

【牛津大学】深度残差强化学习，Deep Residual Reinforcement Learning

专知会员服务

85+阅读 · 2020年2月18日

Risk Sensitive Portfolio Optimization with Regime-Switching and Default Contagion，香港理工大学应用数学系余翔助理教授，第八届全国社会媒体处理大会SMP2019

Risk Sensitive Portfolio Optimization with Regime-Switching and Default Contagion，香港理工大学应用数学系余翔助理教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

10+阅读 · 2019年10月24日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

MIT新书《强化学习与最优控制》

MIT新书《强化学习与最优控制》

专知会员服务

281+阅读 · 2019年10月9日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

动物脑的好奇心和强化学习的好奇心

动物脑的好奇心和强化学习的好奇心

CreateAMind

10+阅读 · 2019年1月26日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

条件GAN重大改进！cGANs with Projection Discriminator

条件GAN重大改进！cGANs with Projection Discriminator

CreateAMind

8+阅读 · 2018年2月7日

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

专知

23+阅读 · 2018年1月18日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

A general class of surrogate functions for stable and efficient reinforcement learning

Arxiv

0+阅读 · 2022年1月13日

Game Theory for Adversarial Attacks and Defenses

Arxiv

0+阅读 · 2022年1月12日

Inverse optimization problems with multiple weight functions

Arxiv

0+阅读 · 2022年1月9日

Stability Based Generalization Bounds for Exponential Family Langevin Dynamics

Arxiv

0+阅读 · 2022年1月9日

Stochastic Saddle Point Problems with Decision-Dependent Distributions

Arxiv

0+阅读 · 2022年1月7日

Sample-based Approximation of Nash in Large Many-Player Games via Gradient Descent

Arxiv

0+阅读 · 2022年1月6日

A Game Theoretical Analysis on the Gray-Wyner System and Generalised Common Information

Arxiv

0+阅读 · 2022年1月5日

Conditional Imitation Learning for Multi-Agent Games

Arxiv

0+阅读 · 2022年1月5日

A Consumer-Theoretic Characterization of Fisher Market Equilibria

Arxiv

0+阅读 · 2022年1月4日

Pipeline PSRO: A Scalable Approach for Finding Approximate Nash Equilibria in Large Games

Arxiv

3+阅读 · 2020年6月15日

VIP会员

文章信息

相关主题

相关VIP内容

ICLR 2021杰出论文奖出炉，8篇论文上榜！

专知会员服务

26+阅读 · 2021年4月2日

INRIA最新「机器学习理论」新书，229页pdf原理性阐述机器学习

INRIA最新「机器学习理论」新书，229页pdf原理性阐述机器学习

专知会员服务

69+阅读 · 2021年3月27日

【Google】平滑对抗训练，Smooth Adversarial Training

【Google】平滑对抗训练，Smooth Adversarial Training

专知会员服务

49+阅读 · 2020年7月4日

【牛津大学】深度残差强化学习，Deep Residual Reinforcement Learning

【牛津大学】深度残差强化学习，Deep Residual Reinforcement Learning

专知会员服务

85+阅读 · 2020年2月18日

Risk Sensitive Portfolio Optimization with Regime-Switching and Default Contagion，香港理工大学应用数学系余翔助理教授，第八届全国社会媒体处理大会SMP2019

Risk Sensitive Portfolio Optimization with Regime-Switching and Default Contagion，香港理工大学应用数学系余翔助理教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

10+阅读 · 2019年10月24日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

MIT新书《强化学习与最优控制》

MIT新书《强化学习与最优控制》

专知会员服务

281+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《在单一作战合成环境（SSE）中运用人工智能与大型语言模型以提供灵活人文地形及可信角色组》报告

《俄罗斯的未来战争方式第二部分：核威慑》报告

《提示战争：大语言模型如何决定军事干预》报告

《俄罗斯的未来战争方式第三部分：军事改革》报告

相关资讯

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

动物脑的好奇心和强化学习的好奇心

动物脑的好奇心和强化学习的好奇心

CreateAMind

10+阅读 · 2019年1月26日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

条件GAN重大改进！cGANs with Projection Discriminator

条件GAN重大改进！cGANs with Projection Discriminator

CreateAMind

8+阅读 · 2018年2月7日

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

专知

23+阅读 · 2018年1月18日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

相关论文

A general class of surrogate functions for stable and efficient reinforcement learning

Arxiv

0+阅读 · 2022年1月13日

Game Theory for Adversarial Attacks and Defenses

Arxiv

0+阅读 · 2022年1月12日

Inverse optimization problems with multiple weight functions

Arxiv

0+阅读 · 2022年1月9日

Stability Based Generalization Bounds for Exponential Family Langevin Dynamics

Arxiv

0+阅读 · 2022年1月9日

Stochastic Saddle Point Problems with Decision-Dependent Distributions

Arxiv

0+阅读 · 2022年1月7日

Sample-based Approximation of Nash in Large Many-Player Games via Gradient Descent

Arxiv

0+阅读 · 2022年1月6日

A Game Theoretical Analysis on the Gray-Wyner System and Generalised Common Information

Arxiv

0+阅读 · 2022年1月5日

Conditional Imitation Learning for Multi-Agent Games

Arxiv

0+阅读 · 2022年1月5日

A Consumer-Theoretic Characterization of Fisher Market Equilibria

Arxiv

0+阅读 · 2022年1月4日

Pipeline PSRO: A Scalable Approach for Finding Approximate Nash Equilibria in Large Games

Arxiv

3+阅读 · 2020年6月15日

微信扫码咨询专知VIP会员