Scalable Anytime Planning for Multi-Agent MDPs - 专知 (Zhuanzhi) Papers


INTERACT · Performer · Monte Carlo Tree Search · Monte Carlo · Computational cost

January 12, 2021

Scalable Anytime Planning for Multi-Agent MDPs


Shushman Choudhury, Jayesh K. Gupta, Peter Morales, Mykel J. Kochenderfer

From arXiv. The first two authors contributed equally. Accepted at AAMAS 2021.

We present a scalable tree search planning algorithm for large multi-agent sequential decision problems that require dynamic collaboration. Teams of agents need to coordinate decisions in many domains, but naive approaches fail due to the exponential growth of the joint action space with the number of agents. We circumvent this complexity through an anytime approach that allows us to trade computation for approximation quality and also dynamically coordinate actions. Our algorithm comprises three elements: online planning with Monte Carlo Tree Search (MCTS), factored representations of local agent interactions with coordination graphs, and the iterative Max-Plus method for joint action selection. We evaluate our approach on the benchmark SysAdmin domain with static coordination graphs and achieve comparable performance with much lower computation cost than our MCTS baselines. We also introduce a multi-drone delivery domain with dynamic, i.e., state-dependent coordination graphs, and demonstrate how our approach scales to large problems on this domain that are intractable for other MCTS methods. We provide an open-source implementation of our algorithm at https://github.com/JuliaPOMDP/FactoredValueMCTS.jl.
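To make the joint action selection step more concrete, here is a minimal Python sketch of Max-Plus message passing on a small coordination graph. It is an illustration under stated assumptions, not the authors' implementation (which is the Julia package linked above): the function names, the four-agent chain graph, and all payoff values are hypothetical, and the MCTS component is left out entirely. Each agent exchanges messages only with its neighbors in the graph, so the work per iteration grows with the number of edges, whereas the brute-force check enumerates every joint action and grows exponentially with the number of agents.

```python
from itertools import product


def max_plus(n_agents, n_actions, node_payoff, edge_payoff, n_iters=10):
    """Approximate the best joint action by iterative Max-Plus.

    node_payoff[i][a]         : payoff of agent i taking action a
    edge_payoff[(i, j)][a][b] : payoff when i takes a and j takes b
    (All names and the data layout are illustrative assumptions.)
    """
    neighbors = {i: [] for i in range(n_agents)}
    for (i, j) in edge_payoff:
        neighbors[i].append(j)
        neighbors[j].append(i)

    def pair(i, a_i, j, a_j):
        # Look up the pairwise payoff regardless of edge orientation.
        if (i, j) in edge_payoff:
            return edge_payoff[(i, j)][a_i][a_j]
        return edge_payoff[(j, i)][a_j][a_i]

    # msg[(i, j)][a_j] is the message sent from agent i to neighbor j.
    msg = {(i, j): [0.0] * n_actions for i in neighbors for j in neighbors[i]}

    for _ in range(n_iters):
        new_msg = {}
        for (i, j) in msg:
            out = []
            for a_j in range(n_actions):
                out.append(max(
                    node_payoff[i][a_i]
                    + pair(i, a_i, j, a_j)
                    + sum(msg[(k, i)][a_i] for k in neighbors[i] if k != j)
                    for a_i in range(n_actions)
                ))
            # Subtract the mean so messages stay bounded on graphs with cycles.
            mean = sum(out) / n_actions
            new_msg[(i, j)] = [v - mean for v in out]
        msg = new_msg

    # Each agent picks its own action from local payoff plus incoming messages.
    joint = []
    for i in range(n_agents):
        scores = [
            node_payoff[i][a] + sum(msg[(k, i)][a] for k in neighbors[i])
            for a in range(n_actions)
        ]
        joint.append(max(range(n_actions), key=scores.__getitem__))
    return tuple(joint)


def total_payoff(joint, node_payoff, edge_payoff):
    """Sum of all local and pairwise payoffs for one joint action."""
    local = sum(node_payoff[i][a] for i, a in enumerate(joint))
    pairwise = sum(t[joint[i]][joint[j]] for (i, j), t in edge_payoff.items())
    return local + pairwise


def brute_force(n_agents, n_actions, node_payoff, edge_payoff):
    """Enumerate all n_actions ** n_agents joint actions (exponential)."""
    return max(
        product(range(n_actions), repeat=n_agents),
        key=lambda joint: total_payoff(joint, node_payoff, edge_payoff),
    )


# Hypothetical example: four agents on a chain 0-1-2-3, two actions each.
NODE_PAYOFF = [[1.0, 0.0], [0.0, 0.5], [0.2, 0.0], [0.0, 0.3]]
EDGE_PAYOFF = {
    (0, 1): [[2.0, 0.0], [0.0, 1.0]],
    (1, 2): [[0.0, 3.0], [1.0, 0.0]],
    (2, 3): [[1.0, 0.0], [0.0, 2.0]],
}

approx = max_plus(4, 2, NODE_PAYOFF, EDGE_PAYOFF)
exact = brute_force(4, 2, NODE_PAYOFF, EDGE_PAYOFF)
print(approx, exact, total_payoff(approx, NODE_PAYOFF, EDGE_PAYOFF))
```

On a tree-structured graph such as this chain, Max-Plus recovers the exact optimum once messages have propagated across the graph. On graphs with cycles it is only an approximation, and the iteration count becomes one way to trade computation for solution quality.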

