Decentralized Multi-Agents by Imitation of a Centralized Controller - 专知 (Zhuanzhi) Papers


April 22, 2021

Decentralized Multi-Agents by Imitation of a Centralized Controller


Alex Tong Lin, Mark J. Debord, Katia Estabridis, Gary Hewer, Guido Montufar, Stanley Osher

We consider a multi-agent reinforcement learning problem where each agent seeks to maximize a shared reward while interacting with other agents, and they may or may not be able to communicate. Typically the agents do not have access to other agent policies and thus each agent is situated in a non-stationary and partially-observable environment. In order to obtain multi-agents that act in a decentralized manner, we introduce a novel algorithm under the popular framework of centralized training, but decentralized execution. This training framework first obtains solutions to a multi-agent problem with a single centralized joint-space learner, which is then used to guide imitation learning for independent decentralized multi-agents. This framework has the flexibility to use any reinforcement learning algorithm to obtain the expert as well as any imitation learning algorithm to obtain the decentralized agents. This is in contrast to other multi-agent learning algorithms that, for example, can require more specific structures. We present some theoretical bounds for our method, and we show that one can obtain decentralized solutions to a multi-agent problem through imitation learning.
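The two-stage idea in the abstract (a centralized joint-space expert first, then imitation by independent agents) can be illustrated with a small toy sketch. This is not the authors' algorithm or code. Everything here is an assumption made for illustration: the two-agent one-step task, the `shared_reward` function, tabular Q-learning as the stand-in for "any reinforcement learning algorithm", and count-based behavior cloning as the stand-in for "any imitation learning algorithm".

```python
import random
from collections import Counter, defaultdict
from itertools import product

OBS = [0, 1]      # hypothetical local observations, one per agent
ACTIONS = [0, 1]  # hypothetical per-agent action set


def shared_reward(state, joint_action):
    """Toy shared reward: +1 for each agent whose action matches its own observation."""
    return sum(1.0 for o, a in zip(state, joint_action) if o == a)


def train_centralized_expert(episodes=5000, lr=0.1, eps=0.2, seed=0):
    """Stage 1: tabular Q-learning over the joint state and joint action space.

    Episodes are one step long, so the update target is just the reward.
    """
    rng = random.Random(seed)
    joint_actions = list(product(ACTIONS, repeat=2))
    q = defaultdict(float)
    for _ in range(episodes):
        state = (rng.choice(OBS), rng.choice(OBS))
        if rng.random() < eps:
            action = rng.choice(joint_actions)  # explore
        else:
            action = max(joint_actions, key=lambda a: q[(state, a)])  # exploit
        reward = shared_reward(state, action)
        q[(state, action)] += lr * (reward - q[(state, action)])

    def expert(state):
        # Greedy joint policy: sees the full joint state.
        return max(joint_actions, key=lambda a: q[(state, a)])

    return expert


def imitate_decentralized(expert, n_agents=2):
    """Stage 2: behavior cloning. Each agent sees only its own observation and
    copies the action the expert most often assigned to it for that observation."""
    counts = [defaultdict(Counter) for _ in range(n_agents)]
    for state in product(OBS, repeat=n_agents):
        joint_action = expert(state)
        for i in range(n_agents):
            counts[i][state[i]][joint_action[i]] += 1
    return [
        {obs: c.most_common(1)[0][0] for obs, c in agent_counts.items()}
        for agent_counts in counts
    ]


expert = train_centralized_expert()
policies = imitate_decentralized(expert)  # one lookup table per agent
```

In this toy task the reward decomposes across agents, so the decentralized policies can reproduce the expert exactly. In general each agent conditions only on its local observation, so the imitated policies may only approximate the centralized expert.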

