11 May 2022

Developing cooperative policies for multi-stage reinforcement learning tasks

Jordan Erskine, Chris Lehnert

From arXiv. This paper supersedes the rejected paper "Developing cooperative policies for multi-stage tasks". arXiv admin note: substantial text overlap with arXiv:2007.00203

Many hierarchical reinforcement learning (HRL) algorithms use a series of independent skills as a basis for solving tasks at a higher level of reasoning. These algorithms do not consider the value of using skills that are cooperative rather than independent. This paper proposes the Cooperative Consecutive Policies (CCP) method, which enables consecutive agents to cooperatively solve long-time-horizon multi-stage tasks. The method modifies the policy of each agent to maximise both the current and the next agent's critic. Cooperatively maximising critics allows each agent to take actions that benefit its own task as well as subsequent tasks. In a multi-room maze domain and a peg-in-hole manipulation domain, the cooperative policies outperformed a set of naive policies, a single agent trained across the entire domain, and another sequential HRL algorithm.
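
The idea of a policy that maximises both its own critic and the next agent's critic can be illustrated with a small sketch. The abstract does not say how the two critics are combined, so everything below is an assumption made for illustration: the convex weighting, the hypothetical `coop_weight` parameter, the function name `cooperative_action`, and the use of a small discrete action set with hand-written critic values. It is not the authors' implementation.

```python
def cooperative_action(q_current, q_next, coop_weight=0.5):
    """Return the index of the action that maximises a mix of two critics.

    q_current: critic values of the current agent, one per action.
    q_next:    critic values of the next agent, for the same state and actions.
    coop_weight: hypothetical mixing coefficient in [0, 1] (an assumption,
                 not taken from the paper). 0 gives the naive policy that
                 only looks at its own critic.
    """
    if len(q_current) != len(q_next):
        raise ValueError("both critics must score the same set of actions")
    if not 0.0 <= coop_weight <= 1.0:
        raise ValueError("coop_weight must lie in [0, 1]")

    # Convex combination of the two critics' estimates for each action.
    combined = [
        (1.0 - coop_weight) * qc + coop_weight * qn
        for qc, qn in zip(q_current, q_next)
    ]
    # Greedy choice over the combined scores (first index wins on ties).
    return max(range(len(combined)), key=combined.__getitem__)


# Toy values: action 0 is slightly better for the current stage,
# but action 1 leaves the next stage in a much better position.
q_stage_1 = [1.0, 0.9, 0.2]
q_stage_2 = [0.0, 1.0, 0.5]

naive_choice = cooperative_action(q_stage_1, q_stage_2, coop_weight=0.0)
cooperative_choice = cooperative_action(q_stage_1, q_stage_2, coop_weight=0.5)
print(naive_choice, cooperative_choice)
```

In this toy setting the naive policy picks the action that is best for its own stage only, while the mixed objective accepts a small loss in the current stage for a larger gain in the following one. With continuous actions, the same combined score would more likely serve as the objective of a gradient-based policy update than as an argmax.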
