Evaluation and Learning in Two-Player Symmetric Games via Best and Better Responses - 专知论文

April 27, 2022

Rui Yan,Weixian Zhang,Ruiliang Deng,Xiaoming Duan,Zongying Shi,Yisheng Zhong

From arXiv; 11 pages, 6 figures

Artificial intelligence and robotic competitions are accompanied by a class of game paradigms in which each player privately commits a strategy to a game system, which simulates the game using the collected joint strategy and then returns payoffs to the players. This paper considers strategy commitment in two-player symmetric games, in which the players' strategy spaces are identical and their payoffs are symmetric. First, we introduce two digraph-based, meta-level metrics for strategy evaluation in two-agent reinforcement learning, grounded in sink equilibrium. The metrics rank the strategies of a single player and determine the set of strategies preferred for private commitment. Then, to find the preferred strategies under these metrics, we propose two variants of the classical self-play learning algorithm, called strictly best-response self-play and weakly better-response self-play. By modeling the learning processes as walks over joint-strategy response digraphs, we prove that the strategies learnt by the two variants are preferred under the two metrics, respectively. We also identify the strategies preferred under both metrics and connect the adjacency matrices induced by one metric and one variant. Finally, simulations illustrate the results.
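To make the idea of learning as a walk over a response digraph more concrete, the following is a minimal sketch under simplifying assumptions, not the paper's construction. It builds a digraph over single strategies (rather than joint strategies) for a small symmetric game, runs a best-response self-play walk on it, and lists the sink strongly connected components, which is the graph-theoretic notion behind sink equilibrium. The payoff matrix `A`, the function names, and the tie-breaking rule are all hypothetical choices made for illustration.

```python
from typing import Dict, FrozenSet, List, Set

# A[i][j] is the payoff to a player using strategy i against an opponent
# using strategy j. In a symmetric game the opponent receives A[j][i].
Payoff = List[List[float]]
Digraph = Dict[int, Set[int]]


def best_response_digraph(A: Payoff) -> Digraph:
    """Edge j -> i whenever i is a best response to an opponent playing j."""
    n = len(A)
    graph: Digraph = {}
    for j in range(n):
        best = max(A[i][j] for i in range(n))
        graph[j] = {i for i in range(n) if A[i][j] == best}
    return graph


def reachable(graph: Digraph, start: int) -> Set[int]:
    """All nodes reachable from start (start itself included)."""
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in graph[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def sink_components(graph: Digraph) -> List[FrozenSet[int]]:
    """Strongly connected components with no outgoing edges."""
    reach = {u: reachable(graph, u) for u in graph}
    sinks: List[FrozenSet[int]] = []
    for u in graph:
        # u lies in a sink component iff everything it reaches can reach it back.
        if all(u in reach[v] for v in reach[u]):
            comp = frozenset(reach[u])
            if comp not in sinks:
                sinks.append(comp)
    return sinks


def self_play_walk(graph: Digraph, start: int, steps: int) -> List[int]:
    """Repeatedly replace the current strategy by a best response to it.

    Ties are broken by the smallest index (an arbitrary, assumed rule).
    """
    path = [start]
    for _ in range(steps):
        path.append(min(graph[path[-1]]))
    return path


# Rock (0), Paper (1), Scissors (2), plus a strategy 3 that loses to all three.
A: Payoff = [
    [0, -1, 1, 1],
    [1, 0, -1, 1],
    [-1, 1, 0, 1],
    [-1, -1, -1, 0],
]
graph = best_response_digraph(A)
print(sink_components(graph))       # [frozenset({0, 1, 2})]
print(self_play_walk(graph, 3, 4))  # [3, 0, 1, 2, 0]
```

In this toy game the walk leaves the dominated strategy 3 after one step and then cycles through Rock, Paper, and Scissors, which together form the only sink component of the digraph.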
