Improved Reinforcement Learning in Cooperative Multi-agent Environments Using Knowledge Transfer

January 17, 2022

Mahnoosh Mahdavimoghaddam,Amin Nikanjam,Monireh Abdoos

From arXiv; accepted for publication in The Journal of Supercomputing

Nowadays, cooperative multi-agent systems are used to learn how to achieve goals in large-scale dynamic environments. However, learning in these environments is challenging, with difficulties ranging from the effect of search space size on learning time to inefficient cooperation among agents. Moreover, reinforcement learning algorithms may suffer from long convergence times in such environments. In this paper, a communication framework is introduced in which agents learn to cooperate effectively; in addition, a new state calculation method considerably reduces the size of the state space. Furthermore, a knowledge-transferring algorithm is presented to share gained experience among the agents, together with a knowledge-fusing mechanism that fuses the knowledge an agent learns from its own experience with the knowledge received from other team members. Finally, simulation results are provided to indicate the efficacy of the proposed method in a complex learning task. We evaluated our approach on the shepherding problem; the results show that the knowledge-transferring mechanism accelerates the learning process and that the state space shrinks because similar states are generated based on the concept of state abstraction.
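The abstract names two ideas, state abstraction and the fusing of an agent's own knowledge with knowledge received from teammates, without giving details. The sketch below is only an illustration of what such a pipeline could look like with tabular Q-learning; it is not the authors' algorithm. The grid-cell abstraction, the visit-count-weighted averaging rule, and every name in it (`abstract_state`, `Agent`, `fuse`, `cell`) are assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, Tuple

State = Tuple[int, int]   # abstract state: a coarse grid cell
Key = Tuple[State, int]   # (abstract state, action index)


def abstract_state(x: float, y: float, cell: float = 5.0) -> State:
    """Map a continuous position to a coarse grid cell.

    Assumed abstraction: nearby positions share one state, which
    shrinks the table the agent has to learn.
    """
    return (int(x // cell), int(y // cell))


class Agent:
    """Tabular Q-learner that also tracks how often each entry was updated."""

    def __init__(self, alpha: float = 0.5, gamma: float = 0.9, n_actions: int = 4):
        self.alpha, self.gamma, self.n_actions = alpha, gamma, n_actions
        self.q: Dict[Key, float] = defaultdict(float)
        self.visits: Dict[Key, int] = defaultdict(int)

    def update(self, s: State, a: int, r: float, s_next: State) -> None:
        """Standard one-step Q-learning update on abstract states."""
        best_next = max(self.q[(s_next, b)] for b in range(self.n_actions))
        self.q[(s, a)] += self.alpha * (r + self.gamma * best_next - self.q[(s, a)])
        self.visits[(s, a)] += 1

    def fuse(self, other: "Agent") -> None:
        """Merge a teammate's table into this one.

        Assumed fusion rule: average the two Q-values, weighted by how
        many updates each agent has made to that entry.
        """
        for key, n_other in list(other.visits.items()):
            n_own = self.visits[key]
            total = n_own + n_other
            if total == 0:
                continue
            self.q[key] = (n_own * self.q[key] + n_other * other.q[key]) / total
            self.visits[key] = total


if __name__ == "__main__":
    teacher, learner = Agent(), Agent()
    s = abstract_state(12.0, 3.0)
    teacher.update(s, 1, 10.0, s)   # teacher gains experience
    learner.fuse(teacher)           # learner receives it without exploring
    print(learner.q[(s, 1)])
```

Weighting by visit count is one simple way to let better-explored entries dominate; other fusion rules (for example, taking the maximum or trusting only the more experienced agent) would fit the same interface.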
