Truthful Self-Play - 专知 (Zhuanzhi) Papers

October 5, 2022

We present a general optimization framework for emergent belief-state representation without any supervision. We employ the common configuration of multiagent reinforcement learning with communication to improve exploration coverage of an environment by leveraging the knowledge of each agent. We find that recurrent neural networks (RNNs) with shared weights are highly biased in partially observable environments because of their noncooperativity. To address this, we design an unbiased version of self-play via mechanism design, also known as reverse game theory, to elicit unbiased knowledge at the Bayesian Nash equilibrium. The key idea is to add imaginary rewards using the peer prediction mechanism, i.e., a mechanism for mutually criticizing information in a decentralized environment. Numerical analyses, including StarCraft exploration tasks with up to 20 agents and off-the-shelf RNNs, demonstrate state-of-the-art performance.
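
The abstract does not spell out how the imaginary rewards are computed, so the following is only a minimal sketch of a generic peer prediction reward, not the paper's implementation. It assumes two agents with binary signals, a known joint prior (the hypothetical `JOINT` table), and the logarithmic scoring rule; all function names are illustrative. Each agent is paid according to how well its report predicts its peer's report, which makes truthful reporting the better choice in expectation when the peer is truthful.

```python
import math

# Hypothetical joint prior P(s_i, s_j) over two binary signals.
# Rows index agent i's signal, columns index peer j's signal.
JOINT = [[0.4, 0.1],
         [0.1, 0.4]]


def peer_prediction_reward(report_i, report_j, joint):
    """Log-score reward for agent i, evaluated against peer j's report."""
    # Posterior P(s_j = report_j | s_i = report_i) implied by the prior.
    marginal_i = sum(joint[report_i])
    posterior = joint[report_i][report_j] / marginal_i
    return math.log(posterior)


def expected_reward(true_signal, report_i, joint):
    """Expected reward of sending report_i after observing true_signal,
    assuming the peer reports its own signal truthfully."""
    marginal = sum(joint[true_signal])
    total = 0.0
    for s_j in range(len(joint[true_signal])):
        # Probability of the peer's signal given agent i's true signal.
        p_peer = joint[true_signal][s_j] / marginal
        total += p_peer * peer_prediction_reward(report_i, s_j, joint)
    return total


truthful = expected_reward(true_signal=0, report_i=0, joint=JOINT)
lying = expected_reward(true_signal=0, report_i=1, joint=JOINT)
print(f"truthful: {truthful:.4f}, lying: {lying:.4f}")
```

In a reinforcement learning setting, a term of this kind would presumably be added to the environment reward for each agent's message; how the paper weights or estimates it is not stated in the abstract.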
