PMIC: Improving Multi-Agent Reinforcement Learning with Progressive Mutual Information Collaboration - Zhuanzhi Papers


June 21, 2022

PMIC: Improving Multi-Agent Reinforcement Learning with Progressive Mutual Information Collaboration


Pengyi Li, Hongyao Tang, Tianpei Yang, Xiaotian Hao, Tong Sang, Yan Zheng, Jianye Hao, Matthew E. Taylor, Wenyuan Tao, Zhen Wang

From arXiv. The paper has been accepted at the Thirty-ninth International Conference on Machine Learning (ICML 2022) and at the Cooperative AI Workshop at the 35th Conference on Neural Information Processing Systems (NeurIPS 2021).

Learning to collaborate is critical in Multi-Agent Reinforcement Learning (MARL). Previous works promote collaboration by maximizing the correlation of agents' behaviors, which is typically characterized by Mutual Information (MI) in different forms. However, we reveal that sub-optimal collaborative behaviors also emerge with strong correlations, and that simply maximizing the MI can, surprisingly, hinder learning towards better collaboration. To address this issue, we propose a novel MARL framework, called Progressive Mutual Information Collaboration (PMIC), for more effective MI-driven collaboration. PMIC uses a new collaboration criterion measured by the MI between global states and joint actions. Based on this criterion, the key idea of PMIC is to maximize the MI associated with superior collaborative behaviors and to minimize the MI associated with inferior ones. The two MI objectives play complementary roles: they facilitate better collaboration while avoiding sub-optimal collaboration. Experiments on a wide range of MARL benchmarks show the superior performance of PMIC compared with other algorithms.
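To make the criterion in the abstract concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes small discrete states and joint actions and uses a count-based plug-in estimate of the MI between global states and joint actions, which is a stand-in chosen only for illustration. The function names, the weights `alpha` and `beta`, and the toy data are all hypothetical. The sketch shows that a sub-optimal joint behavior can have just as high an MI as a good one, and how a "maximize on superior, minimize on inferior" objective could be combined in principle.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Plug-in estimate of I(S; U) in nats from (state, joint_action) samples.

    Exact counting is an illustrative assumption; it only works for small
    discrete spaces and is not the estimator used in the paper.
    """
    n = len(pairs)
    joint = Counter(pairs)
    count_s = Counter(s for s, _ in pairs)
    count_u = Counter(u for _, u in pairs)
    mi = 0.0
    for (s, u), c in joint.items():
        # p(s,u) * log(p(s,u) / (p(s) * p(u))), written in terms of raw counts
        mi += (c / n) * math.log(c * n / (count_s[s] * count_u[u]))
    return mi


def dual_mi_objective(superior, inferior, alpha=1.0, beta=1.0):
    """Hypothetical combined objective: reward MI on superior samples and
    penalize MI on inferior samples. alpha and beta are assumed weights."""
    return alpha * mutual_information(superior) - beta * mutual_information(inferior)


# Toy data: each sample is (global_state, (action_agent_1, action_agent_2)).
# "superior" and "inferior" are assumed to have been labeled by their returns.
superior = [(0, (0, 0)), (1, (1, 1))] * 50
inferior = [(0, (0, 1)), (1, (1, 0))] * 50  # equally state-dependent, but assumed low-return
independent = [(s, (a, b)) for s in (0, 1) for a in (0, 1) for b in (0, 1)] * 10

mi_superior = mutual_information(superior)        # equals log 2
mi_inferior = mutual_information(inferior)        # also log 2: high MI, poor behavior
mi_independent = mutual_information(independent)  # 0: actions ignore the state

print(mi_superior, mi_inferior, mi_independent)
print(dual_mi_objective(superior, independent))
```

In this toy setting the superior and inferior behaviors have identical MI, so MI alone cannot tell them apart; the distinction has to come from separating samples by outcome before applying the two opposite-signed MI terms.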

