Nash equilibrium is a central concept in game theory. Several Nash solvers exist, yet none scale to normal-form games with many actions and many players, especially those with payoff tensors too big to be stored in memory. In this work, we propose an approach that iteratively improves an approximation to a Nash equilibrium through joint play. It accomplishes this by tracing a previously established homotopy that connects instances of the game defined with decaying levels of entropy regularization. To encourage iterates to remain near this path, we efficiently minimize \emph{average deviation incentive} via stochastic gradient descent, intelligently sampling entries in the payoff tensor as needed. This process can also be viewed as constructing and reacting to a polymatrix approximation to the game. In these ways, our proposed approach, \emph{average deviation incentive descent with adaptive sampling} (ADIDAS), is most similar to three classical approaches, namely homotopy-type, Lyapunov, and iterative polymatrix solvers. We demonstrate through experiments the ability of this approach to approximate a Nash equilibrium in normal-form games with as many as seven players and twenty-one actions (over one trillion outcomes) that are orders of magnitude larger than those possible with prior algorithms.
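To make the minimized objective concrete, the following is a minimal sketch of an exploitability-style definition of average deviation incentive; the notation ($u_k$, $x_k$, $\Delta_k$, $\tau$) is introduced here for illustration and is not necessarily the paper's exact formulation:
\[
\mathcal{L}_{\mathrm{ADI}}(x) \;=\; \frac{1}{n}\sum_{k=1}^{n}\Big(\max_{z_k\in\Delta_k} u_k(z_k, x_{-k}) \;-\; u_k(x_k, x_{-k})\Big),
\]
where $u_k$ is player $k$'s expected payoff, $x_{-k}$ denotes the other players' mixed strategies, and $\Delta_k$ is player $k$'s strategy simplex. Under this reading, the entropy-regularized games along the homotopy would replace the inner $\max$ with a softmax (logit) response at temperature $\tau$; annealing $\tau$ toward zero recovers the unregularized objective, while stochastic estimates of $u_k$ from sampled payoff-tensor entries keep each gradient step tractable in large games.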