This paper deals with distributed policy optimization in reinforcement learning, which involves a central controller and a group of learners. In particular, two typical settings encountered in several applications are considered: multi-agent reinforcement learning (RL) and parallel RL, where frequent information exchanges between the learners and the controller are required. For many practical distributed systems, however, the overhead caused by these frequent communication exchanges is considerable and becomes the bottleneck of the overall performance. To address this challenge, a novel policy gradient approach is developed for solving distributed RL. The new approach adaptively skips the policy gradient communication during iterations, and can reduce the communication overhead without degrading learning performance. It is established analytically that: i) the novel algorithm has a convergence rate identical to that of the plain-vanilla policy gradient; while ii) if the distributed learners are heterogeneous in terms of their reward functions, the number of communication rounds needed to achieve a desirable learning accuracy is markedly reduced. Numerical experiments corroborate the communication reduction attained by the novel algorithm compared to alternatives.
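To illustrate the communication-skipping idea described above, the following is a minimal sketch, not the paper's exact algorithm: it uses a toy surrogate in place of real policy gradient estimates, and the threshold-based triggering rule, step size, and all variable names are assumptions introduced for illustration only. Each worker uploads a fresh gradient only when it differs enough from the one it last communicated; otherwise the controller reuses the stale gradient.

```python
# Hypothetical sketch of communication-skipping distributed policy gradient.
# The skipping rule and the toy gradient surrogate are illustrative assumptions,
# not the paper's actual method.
import numpy as np

rng = np.random.default_rng(0)

DIM, NUM_WORKERS, ROUNDS = 5, 4, 50
STEP_SIZE, SKIP_THRESHOLD = 0.1, 1e-2

# Heterogeneous targets mimic learners with different reward functions.
targets = rng.normal(size=(NUM_WORKERS, DIM))

def local_policy_gradient(theta, worker):
    """Toy stand-in for a stochastic policy gradient estimate at one worker."""
    noise = 0.01 * rng.normal(size=DIM)
    return (targets[worker] - theta) + noise

theta = np.zeros(DIM)                      # policy parameters at the controller
last_sent = np.zeros((NUM_WORKERS, DIM))   # most recently communicated gradients
uploads = 0

for t in range(ROUNDS):
    aggregate = np.zeros(DIM)
    for m in range(NUM_WORKERS):
        g = local_policy_gradient(theta, m)
        # Skip the upload if the new gradient barely differs from the last
        # one this worker communicated (hypothetical triggering condition).
        if np.linalg.norm(g - last_sent[m]) >= SKIP_THRESHOLD:
            last_sent[m] = g
            uploads += 1
        aggregate += last_sent[m]          # stale gradient reused when skipped
    theta += STEP_SIZE * aggregate / NUM_WORKERS

print(f"uploads used: {uploads} / {ROUNDS * NUM_WORKERS} possible")
```

In this sketch the controller still performs a gradient step every round, but each worker communicates only when its local gradient has changed appreciably, which is the mechanism by which the abstract's claimed communication savings would arise.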