PerfectDou: Dominating DouDizhu with Perfect Information Distillation - Zhuanzhi (专知) Papers


May 13, 2022

PerfectDou: Dominating DouDizhu with Perfect Information Distillation


Guan Yang, Minghuan Liu, Weijun Hong, Weinan Zhang, Fei Fang, Guangjun Zeng, Yue Lin

From arXiv; 15 pages, 12 figures, 11 tables. The first two authors contributed equally.

As a challenging multi-player card game, DouDizhu has recently drawn much attention for analyzing competition and collaboration in imperfect-information games. In this paper, we propose PerfectDou, a state-of-the-art DouDizhu AI system that dominates the game, built on an actor-critic framework with a proposed technique named perfect information distillation. In detail, we adopt a perfect-training-imperfect-execution framework: the agents use global information to guide policy training as if the game were a perfect-information game, while the trained policies play the imperfect-information game during actual gameplay. To this end, we characterize card and game features for DouDizhu to represent the perfect and imperfect information. To train our system, we adopt proximal policy optimization with generalized advantage estimation in a parallel training paradigm. In experiments, we show how and why PerfectDou beats all existing AI programs and achieves state-of-the-art performance.
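The abstract names generalized advantage estimation (GAE) as part of training but gives no details. As a rough illustration only, the standard GAE recursion can be sketched as below. This is not the authors' code: the function name `gae_advantages`, the default `gamma` and `lam` values, and the toy inputs are all assumptions made for this example. In a perfect-training-imperfect-execution setup, one would expect the `values` to come from a critic that sees the perfect-information features during training, while the policy itself only sees the imperfect-information features; that reading is inferred from the abstract, not confirmed by it.

```python
from typing import List


def gae_advantages(
    rewards: List[float],
    values: List[float],
    gamma: float = 0.99,  # discount factor (assumed value, not from the paper)
    lam: float = 0.95,    # GAE smoothing parameter (assumed value, not from the paper)
) -> List[float]:
    """Standard GAE over one episode.

    `values` must hold len(rewards) + 1 entries: one value estimate per
    step plus a bootstrap value for the state after the last step
    (0.0 if the episode terminated).
    """
    if len(values) != len(rewards) + 1:
        raise ValueError("values must have exactly one more entry than rewards")

    advantages = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the discounted sum of later TD errors.
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]  # one-step TD error
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


# Toy episode: reward only at the final step, critic predicts 0 everywhere.
toy_advantages = gae_advantages(
    rewards=[0.0, 0.0, 1.0],
    values=[0.0, 0.0, 0.0, 0.0],
    gamma=0.5,
    lam=1.0,
)
```

With `lam=1.0` and a zero critic, each advantage reduces to the discounted return from that step, which makes the toy case easy to check by hand. In PPO these advantages would then weight the clipped policy-gradient objective.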

