无模型的神经反反事实悔负最小化,加上诱捕学习 (Model-free Neural Counterfactual Regret Minimization with Bootstrap Learning) - 专知论文

会员服务 ·

0

自助法/自举法 · INFORMS · 近似 · 学成 · 可约的 ·

2022 年 4 月 19 日

Model-free Neural Counterfactual Regret Minimization with Bootstrap Learning

翻译：无模型的神经反反事实悔负最小化,加上诱捕学习

Weiming Liu,Bin Li,Julian Togelius

Counterfactual Regret Minimization (CFR) has achieved many fascinating results in solving large-scale Imperfect Information Games (IIGs). Neural network approximation CFR (neural CFR) is one of the promising techniques that can reduce computation and memory consumption by generalizing decision information between similar states. Current neural CFR algorithms have to approximate cumulative regrets. However, efficient and accurate approximation in a large-scale IIG is still a tough challenge. In this paper, a new CFR variant, Recursive CFR (ReCFR), is proposed. In ReCFR, Recursive Substitute Values (RSVs) are learned and used to replace cumulative regrets. It is proven that ReCFR can converge to a Nash equilibrium at a rate of $O({1}/{\sqrt{T}})$. Based on ReCFR, a new model-free neural CFR with bootstrap learning, Neural ReCFR-B, is proposed. Due to the recursive and non-cumulative nature of RSVs, Neural ReCFR-B has lower-variance training targets than other neural CFRs. Experimental results show that Neural ReCFR-B is competitive with the state-of-the-art neural CFR algorithms at a much lower training cost.

翻译：在解决大规模低效信息游戏(IIG)方面,反事实最小化(CFR)取得了许多令人着迷的成果。神经网络近似CFR(Neal CFR)是通过推广类似州之间的决策信息而减少计算和记忆消耗的有希望的技术之一。当前的神经CFR算法必须大致累积遗憾。然而,在大规模IGG中,高效和准确接近仍然是一个艰巨的挑战。在本文中,提出了一个新的CFR变方,Recursive CFR(Recurive CFR)(ReCFR),在RecFR(RE)中,Recurs Recurive 替代值(RSV)被学习并用来取代累积的遗憾。事实证明,RECFR可以以美元({1}/sqrt{T ⁇ )的速率趋同纳什平衡。在REFR的基础上,提出了一个新的无型CFRRFR(NR)和NFR(NAFR)下级培训结果显示,在CRFR(CRRR)下调低调,在CRAFRA中,在CRRRRADRRA中,其低调调低的训练是其他目标。

0

相关内容

自助法/自举法

自助法/自举法

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

【干货书】机器学习设计模式，408页pdf，Machine Learning Design Patterns

【干货书】机器学习设计模式，408页pdf，Machine Learning Design Patterns

专知会员服务

138+阅读 · 2022年2月6日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

专知会员服务

115+阅读 · 2020年4月5日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

2019年机器学习框架回顾

2019年机器学习框架回顾

专知会员服务

36+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

LibRec 精选：基于LSTM的序列推荐实现（PyTorch）

LibRec 精选：基于LSTM的序列推荐实现（PyTorch）

LibRec智能推荐

50+阅读 · 2018年8月27日

【论文推荐】最新六篇序列推荐相关论文—卷积序列嵌入学习、用户记忆网络、上下文GRU、迁移学习

【论文推荐】最新六篇序列推荐相关论文—卷积序列嵌入学习、用户记忆网络、上下文GRU、迁移学习

专知

10+阅读 · 2018年4月8日

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

专知

23+阅读 · 2018年1月18日

NDRG2介导泛素化蛋白降解途径在抑制HER2阳性乳腺癌耐药中的作用及机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

Anderson型多酸的不对称修饰及可控组装研究

国家自然科学基金

1+阅读 · 2014年12月31日

蛋白质相互作用及结合位点的预测方法研究

国家自然科学基金

2+阅读 · 2013年12月31日

用正电子湮灭研究电子俄歇复合机制

国家自然科学基金

0+阅读 · 2013年12月31日

介尺度磁性复合囊泡状结构材料的可控构筑及性能

国家自然科学基金

0+阅读 · 2012年12月31日

新型核-壳介孔微球的结构调控及惰性气体选择性富集机理

国家自然科学基金

0+阅读 · 2012年12月31日

从内质网应激介导的CHOP凋亡途径探讨BPD发生机制

国家自然科学基金

0+阅读 · 2012年12月31日

蒽醌/石墨烯纳米复合材料电极的电催化氧还原性能及其在异相electro-Fenton-like体系中的应用研究

国家自然科学基金

0+阅读 · 2011年12月31日

高同型半胱氨酸引起蛋白质硝基化的鉴定及机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

细菌FimH蛋白在抗LAMP-2抗体相关性寡免疫复合物新月体肾炎中的分子模拟诱导机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

Accelerating Score-based Generative Models for High-Resolution Image Synthesis

Arxiv

0+阅读 · 2022年6月9日

CFA: Coupled-hypersphere-based Feature Adaptation for Target-Oriented Anomaly Localization

CFA: Coupled-hypersphere-based Feature Adaptation for Target-Oriented Anomaly Localization

Arxiv

0+阅读 · 2022年6月9日

Model-Based Reinforcement Learning Is Minimax-Optimal for Offline Zero-Sum Markov Games

Arxiv

0+阅读 · 2022年6月8日

Hierarchical Similarity Learning for Aliasing Suppression Image Super-Resolution

Arxiv

0+阅读 · 2022年6月7日

Counterfactual Explanations for Machine Learning: A Review

Arxiv

25+阅读 · 2020年10月20日

Hierarchical Graph Pooling with Structure Learning

Arxiv

13+阅读 · 2019年11月14日

A Comprehensive Survey on Transfer Learning

A Comprehensive Survey on Transfer Learning

Arxiv

121+阅读 · 2019年11月7日

A Survey of Model Compression and Acceleration for Deep Neural Networks

Arxiv

66+阅读 · 2019年9月8日

Event Extraction with Generative Adversarial Imitation Learning

Arxiv

13+阅读 · 2018年4月21日

Order-Free RNN with Visual Attention for Multi-Label Classification

Arxiv

16+阅读 · 2017年12月20日

VIP会员

文章信息

相关主题

自助法/自举法

相关VIP内容

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

【干货书】机器学习设计模式，408页pdf，Machine Learning Design Patterns

【干货书】机器学习设计模式，408页pdf，Machine Learning Design Patterns

专知会员服务

138+阅读 · 2022年2月6日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

专知会员服务

115+阅读 · 2020年4月5日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

2019年机器学习框架回顾

2019年机器学习框架回顾

专知会员服务

36+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《美陆军徒步机动作战条令手册》最新168页

【博士论文】基于不确定性的可靠性：现代机器学习中的选择性预测与可信部署

军事后勤数字化未来展望

《美海军后勤体系整合与创新挑战》最新报告

相关资讯

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

LibRec 精选：基于LSTM的序列推荐实现（PyTorch）

LibRec 精选：基于LSTM的序列推荐实现（PyTorch）

LibRec智能推荐

50+阅读 · 2018年8月27日

【论文推荐】最新六篇序列推荐相关论文—卷积序列嵌入学习、用户记忆网络、上下文GRU、迁移学习

【论文推荐】最新六篇序列推荐相关论文—卷积序列嵌入学习、用户记忆网络、上下文GRU、迁移学习

专知

10+阅读 · 2018年4月8日

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

专知

23+阅读 · 2018年1月18日

相关论文

Accelerating Score-based Generative Models for High-Resolution Image Synthesis

Arxiv

0+阅读 · 2022年6月9日

CFA: Coupled-hypersphere-based Feature Adaptation for Target-Oriented Anomaly Localization

CFA: Coupled-hypersphere-based Feature Adaptation for Target-Oriented Anomaly Localization

Arxiv

0+阅读 · 2022年6月9日

Model-Based Reinforcement Learning Is Minimax-Optimal for Offline Zero-Sum Markov Games

Arxiv

0+阅读 · 2022年6月8日

Hierarchical Similarity Learning for Aliasing Suppression Image Super-Resolution

Arxiv

0+阅读 · 2022年6月7日

Counterfactual Explanations for Machine Learning: A Review

Arxiv

25+阅读 · 2020年10月20日

Hierarchical Graph Pooling with Structure Learning

Arxiv

13+阅读 · 2019年11月14日

A Comprehensive Survey on Transfer Learning

A Comprehensive Survey on Transfer Learning

Arxiv

121+阅读 · 2019年11月7日

A Survey of Model Compression and Acceleration for Deep Neural Networks

Arxiv

66+阅读 · 2019年9月8日

Event Extraction with Generative Adversarial Imitation Learning

Arxiv

13+阅读 · 2018年4月21日

Order-Free RNN with Visual Attention for Multi-Label Classification

Arxiv

16+阅读 · 2017年12月20日

相关基金

NDRG2介导泛素化蛋白降解途径在抑制HER2阳性乳腺癌耐药中的作用及机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

Anderson型多酸的不对称修饰及可控组装研究

国家自然科学基金

1+阅读 · 2014年12月31日

蛋白质相互作用及结合位点的预测方法研究

国家自然科学基金

2+阅读 · 2013年12月31日

用正电子湮灭研究电子俄歇复合机制

国家自然科学基金

0+阅读 · 2013年12月31日

介尺度磁性复合囊泡状结构材料的可控构筑及性能

国家自然科学基金

0+阅读 · 2012年12月31日

新型核-壳介孔微球的结构调控及惰性气体选择性富集机理

国家自然科学基金

0+阅读 · 2012年12月31日

从内质网应激介导的CHOP凋亡途径探讨BPD发生机制

国家自然科学基金

0+阅读 · 2012年12月31日

蒽醌/石墨烯纳米复合材料电极的电催化氧还原性能及其在异相electro-Fenton-like体系中的应用研究

国家自然科学基金

0+阅读 · 2011年12月31日

高同型半胱氨酸引起蛋白质硝基化的鉴定及机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

细菌FimH蛋白在抗LAMP-2抗体相关性寡免疫复合物新月体肾炎中的分子模拟诱导机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员