Semantically Aligned Task Decomposition in Multi-Agent Reinforcement Learning - 专知论文

会员服务 ·

0

Learning · 稀疏 · motivation · state-of-the-art · Performer ·

2023 年 5 月 18 日

Semantically Aligned Task Decomposition in Multi-Agent Reinforcement Learning

翻译：暂无翻译

Wenhao Li,Dan Qiao,Baoxiang Wang,Xiangfeng Wang,Bo Jin,Hongyuan Zha

from arxiv, 39 pages, 10 figures

The difficulty of appropriately assigning credit is particularly heightened in cooperative MARL with sparse reward, due to the concurrent time and structural scales involved. Automatic subgoal generation (ASG) has recently emerged as a viable MARL approach inspired by utilizing subgoals in intrinsically motivated reinforcement learning. However, end-to-end learning of complex task planning from sparse rewards without prior knowledge, undoubtedly requires massive training samples. Moreover, the diversity-promoting nature of existing ASG methods can lead to the "over-representation" of subgoals, generating numerous spurious subgoals of limited relevance to the actual task reward and thus decreasing the sample efficiency of the algorithm. To address this problem and inspired by the disentangled representation learning, we propose a novel "disentangled" decision-making method, Semantically Aligned task decomposition in MARL (SAMA), that prompts pretrained language models with chain-of-thought that can suggest potential goals, provide suitable goal decomposition and subgoal allocation as well as self-reflection-based replanning. Additionally, SAMA incorporates language-grounded RL to train each agent's subgoal-conditioned policy. SAMA demonstrates considerable advantages in sample efficiency compared to state-of-the-art ASG methods, as evidenced by its performance on two challenging sparse-reward tasks, Overcooked and MiniRTS.

翻译：暂无翻译

0

相关内容

Learning

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

活化的PLC-γ及与Akt关联调控OA软骨基质代谢的机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

Mg-Al-Ca-Sr镁合金热变形行为的位错机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

Prohibitin调控癌组织内源性雄激素合成促进前列腺癌激素抵抗性进展机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

MAPK/ NF-к -PPARs交互串联信号通路调控肺炎衣原体促动脉粥样硬化形成的机制

国家自然科学基金

0+阅读 · 2009年12月31日

计算机辅助构建组织工程肝脏仿生三维血管网络的研究

国家自然科学基金

0+阅读 · 2009年12月31日

Sample Efficient Deep Reinforcement Learning via Local Planning

Arxiv

0+阅读 · 2023年7月3日

Learning to Optimize for Reinforcement Learning

Arxiv

0+阅读 · 2023年7月2日

A Survey on Causal Reinforcement Learning

Arxiv

29+阅读 · 2023年2月10日

MetAug: Contrastive Learning via Meta Feature Augmentation

Arxiv

10+阅读 · 2022年3月10日

Q-value Path Decomposition for Deep Multiagent Reinforcement Learning

Q-value Path Decomposition for Deep Multiagent Reinforcement Learning

Arxiv

26+阅读 · 2020年2月10日

VIP会员

文章信息

相关主题

state-of-the-art

相关VIP内容

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

《利用人工智能对军事行动进行建模》

《利用人工智能学习、优化与推演美国海军作战部队的战略布局与分散（续文）》

机器人、无人机与实时影像：应对城市爆炸威胁的三大技术方案

《指挥官意图消息中关键概念自动提取》最新47页

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

相关论文

Sample Efficient Deep Reinforcement Learning via Local Planning

Arxiv

0+阅读 · 2023年7月3日

Learning to Optimize for Reinforcement Learning

Arxiv

0+阅读 · 2023年7月2日

A Survey on Causal Reinforcement Learning

Arxiv

29+阅读 · 2023年2月10日

MetAug: Contrastive Learning via Meta Feature Augmentation

Arxiv

10+阅读 · 2022年3月10日

Q-value Path Decomposition for Deep Multiagent Reinforcement Learning

Q-value Path Decomposition for Deep Multiagent Reinforcement Learning

Arxiv

26+阅读 · 2020年2月10日

相关基金

活化的PLC-γ及与Akt关联调控OA软骨基质代谢的机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

Mg-Al-Ca-Sr镁合金热变形行为的位错机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

Prohibitin调控癌组织内源性雄激素合成促进前列腺癌激素抵抗性进展机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

MAPK/ NF-к -PPARs交互串联信号通路调控肺炎衣原体促动脉粥样硬化形成的机制

国家自然科学基金

0+阅读 · 2009年12月31日

计算机辅助构建组织工程肝脏仿生三维血管网络的研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员