多方代理LQR的拆分和平行计算 (Decomposability and Parallel Computation of Multi-Agent LQR) - 专知论文

会员服务 ·

0

PARCO · 线性的 · 学成 · 变换 · 控制器 ·

2021 年 3 月 7 日

Decomposability and Parallel Computation of Multi-Agent LQR

翻译：多方代理LQR的拆分和平行计算

Gangshan Jing,He Bai,Jemin George,Aranya Chakrabortty

from arxiv, This paper contains proofs of all the theorems in the conference paper "Decomposability and Parallel Computation of Multi-Agent LQR"

Individual agents in a multi-agent system (MAS) may have decoupled open-loop dynamics, but a cooperative control objective usually results in coupled closed-loop dynamics thereby making the control design computationally expensive. The computation time becomes even higher when a learning strategy such as reinforcement learning (RL) needs to be applied to deal with the situation when the agents dynamics are not known. To resolve this problem, we propose a parallel RL scheme for a linear quadratic regulator (LQR) design in a continuous-time linear MAS. The idea is to exploit the structural properties of two graphs embedded in the $Q$ and $R$ weighting matrices in the LQR objective to define an orthogonal transformation that can convert the original LQR design to multiple decoupled smaller-sized LQR designs. We show that if the MAS is homogeneous then this decomposition retains closed-loop optimality. Conditions for decomposability, an algorithm for constructing the transformation matrix, a parallel RL algorithm, and robustness analysis when the design is applied to non-homogeneous MAS are presented. Simulations show that the proposed approach can guarantee significant speed-up in learning without any loss in the cumulative value of the LQR cost.

翻译：多试剂系统(MAS)中的个别物剂可能已经分解开开环状动态,但合作控制目标通常导致封闭环状动态的结合,从而使控制设计计算费用昂贵。当需要应用强化学习(RL)等学习战略来处理代理动态未知的情况时,计算时间就变得更高。为了解决这个问题,我们提议在连续时间线性MAS中为线性四管调节器设计平行的RL计划。设想是利用LQR目标中嵌入的两张图表的结构属性,使封闭环状图和超环加权表中的美元使控制设计成本昂贵。当将原LQR设计转换成多种分解的较小LQR设计时,计算时间就会变得更高。我们表明,如果MAS是均匀的,那么这种分解组合保留了闭环式最佳性。在连续时间性线性MAS设计中,构建变异矩阵的算法,平行RLL算法,以及当设计应用于LQRR目标中的非热性加权矩阵矩阵时,那么在累积性亏损率方法中,就能够保证拟议的模拟学习重大速度方法。

0

相关内容

PARCO

PARCO：Parallel Computing。 Explanation：并行计算。 Publisher：Elsevier。 SIT:http://dblp.uni-trier.de/db/conf/parco/

不可错过！「强化学习导论」多伦多大学2021课程，附SLIDES与140页pdf

不可错过！「强化学习导论」多伦多大学2021课程，附SLIDES与140页pdf

专知会员服务

67+阅读 · 2021年3月24日

不可错过！UIUC最新《统计强化学习》课程！

专知会员服务

53+阅读 · 2020年9月7日

Python计算导论，560页pdf，Introduction to Computing Using Python

Python计算导论，560页pdf，Introduction to Computing Using Python

专知会员服务

76+阅读 · 2020年5月5日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

康奈尔大学Jon Kleinberg经典书《算法设计Algorithm Design》课件PPT与电子书，864页pdf

康奈尔大学Jon Kleinberg经典书《算法设计Algorithm Design》课件PPT与电子书，864页pdf

专知会员服务

235+阅读 · 2020年1月21日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

MIT新书《强化学习与最优控制》

MIT新书《强化学习与最优控制》

专知会员服务

280+阅读 · 2019年10月9日

【微软Alekh等开放新书】强化学习理论与算法，83页pdf，了解最新进展

【微软Alekh等开放新书】强化学习理论与算法，83页pdf，了解最新进展

专知

25+阅读 · 2019年11月23日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Call for Participation: Shared Tasks in NLPCC 2019

Call for Participation: Shared Tasks in NLPCC 2019

中国计算机学会

5+阅读 · 2019年3月22日

动物脑的好奇心和强化学习的好奇心

动物脑的好奇心和强化学习的好奇心

CreateAMind

10+阅读 · 2019年1月26日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

Ray RLlib: Scalable 降龙十八掌

Ray RLlib: Scalable 降龙十八掌

CreateAMind

9+阅读 · 2018年12月28日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

强化学习 cartpole_a3c

强化学习 cartpole_a3c

CreateAMind

9+阅读 · 2017年7月21日

QoS-Aware Placement of Deep Learning Services on the Edge with Multiple Service Implementations

Arxiv

0+阅读 · 2021年4月30日

Discrete-Time Mean Field Control with Environment States

Arxiv

0+阅读 · 2021年4月30日

On Uninformative Optimal Policies in Adaptive LQR with Unknown B-Matrix

Arxiv

0+阅读 · 2021年4月30日

Joint QoS-aware and Cost-efficient Task Scheduling for Fog-Cloud Resources in a Volunteer Computing System

Arxiv

0+阅读 · 2021年4月28日

Semi-On-Policy Training for Sample Efficient Multi-Agent Policy Gradients

Arxiv

0+阅读 · 2021年4月27日

Exploration-Exploitation in Multi-Agent Learning: Catastrophe Theory Meets Game Theory

Exploration-Exploitation in Multi-Agent Learning: Catastrophe Theory Meets Game Theory

Arxiv

15+阅读 · 2020年12月15日

Efficient and Effective $L_0$ Feature Selection

Efficient and Effective $L_0$ Feature Selection

Arxiv

5+阅读 · 2018年8月7日

Optimal Algorithms for Non-Smooth Distributed Optimization in Networks

Arxiv

7+阅读 · 2018年6月1日

Modeling Others using Oneself in Multi-Agent Reinforcement Learning

Arxiv

4+阅读 · 2018年3月22日

The Search Problem in Mixture Models

Arxiv

3+阅读 · 2018年2月24日

VIP会员

文章信息

相关主题

相关VIP内容

不可错过！「强化学习导论」多伦多大学2021课程，附SLIDES与140页pdf

不可错过！「强化学习导论」多伦多大学2021课程，附SLIDES与140页pdf

专知会员服务

67+阅读 · 2021年3月24日

不可错过！UIUC最新《统计强化学习》课程！

专知会员服务

53+阅读 · 2020年9月7日

Python计算导论，560页pdf，Introduction to Computing Using Python

Python计算导论，560页pdf，Introduction to Computing Using Python

专知会员服务

76+阅读 · 2020年5月5日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

康奈尔大学Jon Kleinberg经典书《算法设计Algorithm Design》课件PPT与电子书，864页pdf

康奈尔大学Jon Kleinberg经典书《算法设计Algorithm Design》课件PPT与电子书，864页pdf

专知会员服务

235+阅读 · 2020年1月21日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

MIT新书《强化学习与最优控制》

MIT新书《强化学习与最优控制》

专知会员服务

280+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【新书】面向企业的图学习扩展：生产级图学习与推理，485页pdf

AI智能体编程：技术、挑战与机遇综述

【国家标准】数据安全技术数据安全风险评估方法

【CMU博士论文】交互式学习的进展：替代性反馈机制与自适应因果推理

相关资讯

【微软Alekh等开放新书】强化学习理论与算法，83页pdf，了解最新进展

【微软Alekh等开放新书】强化学习理论与算法，83页pdf，了解最新进展

专知

25+阅读 · 2019年11月23日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Call for Participation: Shared Tasks in NLPCC 2019

Call for Participation: Shared Tasks in NLPCC 2019

中国计算机学会

5+阅读 · 2019年3月22日

动物脑的好奇心和强化学习的好奇心

动物脑的好奇心和强化学习的好奇心

CreateAMind

10+阅读 · 2019年1月26日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

Ray RLlib: Scalable 降龙十八掌

Ray RLlib: Scalable 降龙十八掌

CreateAMind

9+阅读 · 2018年12月28日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

强化学习 cartpole_a3c

强化学习 cartpole_a3c

CreateAMind

9+阅读 · 2017年7月21日

相关论文

QoS-Aware Placement of Deep Learning Services on the Edge with Multiple Service Implementations

Arxiv

0+阅读 · 2021年4月30日

Discrete-Time Mean Field Control with Environment States

Arxiv

0+阅读 · 2021年4月30日

On Uninformative Optimal Policies in Adaptive LQR with Unknown B-Matrix

Arxiv

0+阅读 · 2021年4月30日

Joint QoS-aware and Cost-efficient Task Scheduling for Fog-Cloud Resources in a Volunteer Computing System

Arxiv

0+阅读 · 2021年4月28日

Semi-On-Policy Training for Sample Efficient Multi-Agent Policy Gradients

Arxiv

0+阅读 · 2021年4月27日

Exploration-Exploitation in Multi-Agent Learning: Catastrophe Theory Meets Game Theory

Exploration-Exploitation in Multi-Agent Learning: Catastrophe Theory Meets Game Theory

Arxiv

15+阅读 · 2020年12月15日

Efficient and Effective $L_0$ Feature Selection

Efficient and Effective $L_0$ Feature Selection

Arxiv

5+阅读 · 2018年8月7日

Optimal Algorithms for Non-Smooth Distributed Optimization in Networks

Arxiv

7+阅读 · 2018年6月1日

Modeling Others using Oneself in Multi-Agent Reinforcement Learning

Arxiv

4+阅读 · 2018年3月22日

The Search Problem in Mixture Models

Arxiv

3+阅读 · 2018年2月24日

微信扫码咨询专知VIP会员