Optimizing deep learning algorithms to approach a Nash equilibrium remains a significant problem in imperfect-information games such as StarCraft and poker. Neural Fictitious Self-Play (NFSP) provides an effective way to learn an approximate Nash equilibrium in imperfect-information games without prior domain knowledge. However, the optimality gap of NFSP remains an open optimization problem, and narrowing it can improve NFSP's performance. In this study, focusing on the optimality gap of NFSP, we propose a new method that replaces NFSP's best-response computation with regret matching. The new algorithm drives the optimality gap to zero as it iterates, and thus converges faster than the original NFSP. We conducted experiments on three typical perfect-information and imperfect-information game environments in OpenSpiel, all of which showed that our new algorithm performs better than the original NFSP.
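The regret-matching rule mentioned above can be illustrated with a minimal self-contained sketch on rock-paper-scissors: each player accumulates counterfactual regrets per action and plays in proportion to the positive regrets, so that the time-averaged strategies approach the uniform Nash equilibrium. This is only an illustrative example of plain regret matching under assumed function names (`get_strategy`, `train`), not the paper's NFSP-based algorithm.

```python
# Row player's payoff for action a vs action b in rock-paper-scissors:
# 0 = tie, 1 = win, -1 = loss. The game is symmetric, so the same
# matrix gives either player's payoff against the opponent's action.
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]

def get_strategy(cum_regret):
    """Regret matching: play actions in proportion to positive cumulative regret."""
    pos = [max(r, 0.0) for r in cum_regret]
    norm = sum(pos)
    n = len(cum_regret)
    return [p / norm for p in pos] if norm > 0 else [1.0 / n] * n

def train(iters=50000):
    # Asymmetric initial regrets so the self-play dynamics are non-trivial.
    regrets = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    strat_sums = [[0.0] * 3, [0.0] * 3]
    for _ in range(iters):
        strats = [get_strategy(r) for r in regrets]
        for p in (0, 1):
            # Accumulate strategies; the average converges, not the iterates.
            strat_sums[p] = [s + x for s, x in zip(strat_sums[p], strats[p])]
            opp = strats[1 - p]
            # Expected utility of each pure action vs the opponent's strategy.
            util = [sum(PAYOFF[a][b] * opp[b] for b in range(3)) for a in range(3)]
            ev = sum(u * s for u, s in zip(util, strats[p]))
            # Regret of each action = its utility minus the realized expected value.
            regrets[p] = [r + (u - ev) for r, u in zip(regrets[p], util)]
    # Time-averaged strategies approximate the Nash equilibrium (uniform in RPS).
    return [[s / sum(ss) for s in ss] for ss in strat_sums]
```

Running `train()` yields average strategies close to (1/3, 1/3, 1/3) for both players, the unique Nash equilibrium of rock-paper-scissors; the average regret shrinks at a rate of O(1/sqrt(T)), which is the convergence property the abstract's optimality-gap argument relies on.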