Multi-Agent Sequential Decision-Making via Communication - 专知 Papers

September 26, 2022

Multi-Agent Sequential Decision-Making via Communication

Ziluo Ding, Kefan Su, Weixin Hong, Liwen Zhu, Tiejun Huang, Zongqing Lu

From arXiv, 20 pages

Communication helps agents obtain information about others so that better-coordinated behavior can be learned. Some existing work has agents communicate their predicted future trajectories, so that each agent gets clues about what the others will do and can coordinate better. However, when agents are treated synchronously, circular dependencies can occur, which makes it hard to coordinate decision-making. In this paper, we propose a novel communication scheme, Sequential Communication (SeqComm). SeqComm treats agents asynchronously (the upper-level agents make decisions before the lower-level ones) and has two communication phases. In the negotiation phase, agents determine the priority of decision-making by communicating hidden states of observations and comparing the value of intention, which is obtained by modeling the environment dynamics. In the launching phase, the upper-level agents take the lead in making decisions and communicate their actions to the lower-level agents. Theoretically, we prove that the policies learned by SeqComm are guaranteed to improve monotonically and converge. Empirically, we show that SeqComm outperforms existing methods in various multi-agent cooperative tasks.
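
The two phases described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names `negotiate_order`, `launch`, and `greedy_policy`, the preference table, and the hard-coded intention values are all hypothetical. In the paper the intention values come from a learned model of the environment dynamics; here they are fixed numbers chosen only to show how a priority order removes circular dependencies.

```python
from typing import Callable, Dict, List, Sequence


def negotiate_order(intention_values: Sequence[float]) -> List[int]:
    """Toy negotiation phase: rank agents by intention value, highest first.

    Ties are broken by agent index because Python's sort is stable.
    """
    return sorted(range(len(intention_values)), key=lambda i: -intention_values[i])


def launch(order: Sequence[int],
           policy: Callable[[int, Dict[int, int]], int]) -> Dict[int, int]:
    """Toy launching phase: agents act in priority order.

    Each agent receives the actions already chosen by the upper-level agents.
    """
    actions: Dict[int, int] = {}
    for agent in order:
        actions[agent] = policy(agent, dict(actions))
    return actions


# Hypothetical task: three agents should pick distinct targets.
# preferences[i][t] is how much agent i likes target t (made-up numbers).
preferences = [
    [0.9, 0.1, 0.5],
    [0.8, 0.7, 0.2],
    [0.3, 0.6, 0.4],
]


def greedy_policy(agent: int, upper_actions: Dict[int, int]) -> int:
    """Pick the most preferred target not already taken by upper-level agents."""
    taken = set(upper_actions.values())
    free = [t for t in range(len(preferences[agent])) if t not in taken]
    return max(free, key=lambda t: preferences[agent][t])


# Assumed intention values (stand-ins for values a world model would produce).
intention_values = [0.2, 0.9, 0.5]

order = negotiate_order(intention_values)
actions = launch(order, greedy_policy)
print("decision order:", order)
print("actions:", actions)
```

Because each lower-level agent conditions on the actions already announced above it, no agent has to guess what the others will do at the same time, which is the circular dependency that the synchronous setting runs into.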
