CGIBNet: Bandwidth-constrained Communication with Graph Information Bottleneck in Multi-Agent Reinforcement Learning - Zhuanzhi (专知) Papers


June 10, 2022

CGIBNet: Bandwidth-constrained Communication with Graph Information Bottleneck in Multi-Agent Reinforcement Learning


Qi Tian, Kun Kuang, Baoxiang Wang, Furui Liu, Fei Wu

Communication is one of the core components for cooperative multi-agent reinforcement learning (MARL). The communication bandwidth, in many real applications, is always subject to certain constraints. To improve communication efficiency, in this article, we propose to simultaneously optimize whom to communicate with and what to communicate for each agent in MARL. By initiating the communication between agents with a directed complete graph, we propose a novel communication model, named Communicative Graph Information Bottleneck Network (CGIBNet), to simultaneously compress the graph structure and the node information with the graph information bottleneck principle. The graph structure compression is designed to cut the redundant edges for determining whom to communicate with. The node information compression aims to address the problem of what to communicate via learning compact node representations. Moreover, CGIBNet is the first universal module for bandwidth-constrained communication, which can be applied to various training frameworks (i.e., policy-based and value-based MARL frameworks) and communication modes (i.e., single-round and multi-round communication). Extensive experiments are conducted in Traffic Control and StarCraft II environments. The results indicate that our method can achieve better performance in bandwidth-constrained settings compared with state-of-the-art algorithms, especially for large-scale multi-agent tasks.
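The two compression steps named in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation: the abstract does not specify how edges are scored or how node representations are compressed, so the hard top-k edge gate, the Gaussian message encoder, and every function name here (`complete_directed_edges`, `prune_edges`, `gaussian_kl`, `sample_message`, `aggregate`) are assumptions chosen for illustration. It starts from a directed complete graph, keeps only a fixed budget of edges (whom to communicate with), and measures the cost of each node's message with the KL term commonly used in variational information bottleneck objectives (what to communicate).

```python
import math
import random


def complete_directed_edges(n):
    """All ordered (sender, receiver) pairs with sender != receiver."""
    return [(i, j) for i in range(n) for j in range(n) if i != j]


def prune_edges(scores, budget):
    """Keep the `budget` highest-scoring edges (assumed hard top-k gate)."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:budget])


def gaussian_kl(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), a standard bottleneck penalty."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))


def sample_message(mu, log_var, rng):
    """Reparameterised sample z = mu + sigma * eps with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]


def aggregate(messages, kept_edges, n):
    """Each receiver averages the messages arriving over the kept edges."""
    dim = len(next(iter(messages.values())))
    inbox = []
    for receiver in range(n):
        senders = [s for (s, r) in kept_edges if r == receiver]
        if not senders:
            inbox.append([0.0] * dim)  # no incoming edge survived pruning
        else:
            inbox.append(
                [sum(messages[s][d] for s in senders) / len(senders) for d in range(dim)]
            )
    return inbox


# Toy run: 4 agents, 2-dimensional messages, at most 5 of the 12 edges kept.
rng = random.Random(0)
n_agents, dim, budget = 4, 2, 5
edges = complete_directed_edges(n_agents)
scores = {e: rng.random() for e in edges}  # stand-in for learned edge scores
kept = prune_edges(scores, budget)
mu = {i: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for i in range(n_agents)}
log_var = {i: [0.0] * dim for i in range(n_agents)}
messages = {i: sample_message(mu[i], log_var[i], rng) for i in range(n_agents)}
inbox = aggregate(messages, kept, n_agents)
compression_cost = sum(gaussian_kl(mu[i], log_var[i]) for i in range(n_agents))
```

In a trained system the edge scores and the message means and variances would come from learned networks, and the compression cost would be added to the reinforcement learning loss with a trade-off weight; here they are random placeholders so that the example runs on its own.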

