Effective and Stable Role-Based Multi-Agent Collaboration by Structural Information Principles - 专知论文

Multi-agent · Agent · Structure · Collaboration · SR

April 3, 2023

Effective and Stable Role-Based Multi-Agent Collaboration by Structural Information Principles

Xianghua Zeng, Hao Peng, Angsheng Li

From arXiv, 9 pages, 8 figures, 2 references

Role-based learning is a promising approach to improving the performance of Multi-Agent Reinforcement Learning (MARL). Nevertheless, without manual assistance, current role-based methods cannot guarantee stably discovering a set of roles to effectively decompose a complex task, as they assume either a predefined role structure or practical experience for selecting hyperparameters. In this article, we propose a mathematical Structural Information principles-based Role Discovery method, namely SIRD, and then present a SIRD optimizing MARL framework, namely SR-MARL, for multi-agent collaboration. The SIRD transforms role discovery into a hierarchical action space clustering. Specifically, the SIRD consists of structuralization, sparsification, and optimization modules, where an optimal encoding tree is generated to perform abstracting to discover roles. The SIRD is agnostic to specific MARL algorithms and flexibly integrated with various value function factorization approaches. Empirical evaluations on the StarCraft II micromanagement benchmark demonstrate that, compared with state-of-the-art MARL algorithms, the SR-MARL framework improves the average test win rate by 0.17%, 6.08%, and 3.24%, and reduces the deviation by 16.67%, 30.80%, and 66.30%, under easy, hard, and super hard scenarios.
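The abstract describes role discovery as clustering the action space by optimizing an encoding tree. As a rough illustration only, the sketch below computes the two-dimensional structural entropy of a partition, using the standard definition from structural information theory, on a small weighted graph. The graph, the node names, the edge weights, and the function name `structural_entropy_2d` are all assumptions made for this example; the abstract does not specify how SIRD builds or sparsifies its action graph, so this is not the authors' implementation.

```python
import math


def structural_entropy_2d(weights, partition):
    """Two-dimensional structural entropy of an undirected weighted graph.

    weights:   dict mapping frozenset({u, v}) -> positive edge weight
    partition: list of node sets; every node appears in exactly one set
    """
    # Weighted degree of each node.
    degree = {}
    for edge, w in weights.items():
        for node in edge:
            degree[node] = degree.get(node, 0.0) + w
    total = sum(degree.values())  # graph volume: twice the total edge weight

    entropy = 0.0
    for module in partition:
        volume = sum(degree[n] for n in module)
        # Weight of edges with exactly one endpoint inside the module.
        cut = sum(w for edge, w in weights.items() if len(edge & module) == 1)
        for n in module:
            entropy -= degree[n] / total * math.log2(degree[n] / volume)
        entropy -= cut / total * math.log2(volume / total)
    return entropy


# Hypothetical action graph: two tightly linked triples joined by one weak edge.
edges = {
    frozenset({"a0", "a1"}): 1.0,
    frozenset({"a1", "a2"}): 1.0,
    frozenset({"a0", "a2"}): 1.0,
    frozenset({"b0", "b1"}): 1.0,
    frozenset({"b1", "b2"}): 1.0,
    frozenset({"b0", "b2"}): 1.0,
    frozenset({"a2", "b0"}): 0.1,
}

good = structural_entropy_2d(edges, [{"a0", "a1", "a2"}, {"b0", "b1", "b2"}])
bad = structural_entropy_2d(edges, [{"a0", "a1", "b0"}, {"a2", "b1", "b2"}])
flat = structural_entropy_2d(edges, [{"a0", "a1", "a2", "b0", "b1", "b2"}])

print(f"good={good:.3f} bad={bad:.3f} flat={flat:.3f}")
```

The partition that follows the two dense groups yields the lowest entropy of the three, which is the sense in which minimizing structural entropy favors a clustering that matches the graph's community structure. In a role-discovery setting, each resulting cluster of actions would play the part of a candidate role.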
