机器人多智能体强化学习中谱归一化的效应 (Effects of Spectral Normalization in Multi-agent Reinforcement Learning) - 专知论文

会员服务 ·

0

评论员 · 多智能体 · 正则化 · 稀疏 · 归一化 ·

2023 年 4 月 20 日

Effects of Spectral Normalization in Multi-agent Reinforcement Learning

翻译：机器人多智能体强化学习中谱归一化的效应

Kinal Mehta,Anuj Mahajan,Pawan Kumar

from arxiv, Accepted at IJCNN-2023

A reliable critic is central to on-policy actor-critic learning. But it becomes challenging to learn a reliable critic in a multi-agent sparse reward scenario due to two factors: 1) The joint action space grows exponentially with the number of agents 2) This, combined with the reward sparseness and environment noise, leads to large sample requirements for accurate learning. We show that regularising the critic with spectral normalization (SN) enables it to learn more robustly, even in multi-agent on-policy sparse reward scenarios. Our experiments show that the regularised critic is quickly able to learn from the sparse rewarding experience in the complex SMAC and RWARE domains. These findings highlight the importance of regularisation in the critic for stable learning.

翻译：在策略演员-评论员学习中，一个可靠的评论员是至关重要的。但在多智能体稀疏奖励场景中学习可靠的评论员变得具有挑战性，这主要由两个因素造成：1）随着代理数量的增加，联合动作空间指数级增长2）这会导致需求大量的样本来进行精确学习，再加上奖励稀疏性和环境噪声，使得学习变得困难。我们展示了通过谱归一化（SN）正则化评论员，使其能够在多智能体策略稀疏奖励场景中更加稳定地学习。我们的实验表明，通过正则化评论员，它能够从复杂的SMAC和RWARE领域的稀疏奖励体验中快速学习。这些发现强调了评论员正则化在稳定学习方面的重要性。

0

相关内容

评论员

【AI+军事】美国HRL实验室AAAI2020《基于强化学习的多智能体任务规划》，Multi-Agent Mission Planning with Reinforcement Learning

【AI+军事】美国HRL实验室AAAI2020《基于强化学习的多智能体任务规划》，Multi-Agent Mission Planning with Reinforcement Learning

专知会员服务

231+阅读 · 2022年4月10日

【ToG 2021】强化学习中图像局部区域敏感的探索奖励，Deep Reinforcement Learning with Part-aware Exploration Bonus in Video Games

【ToG 2021】强化学习中图像局部区域敏感的探索奖励，Deep Reinforcement Learning with Part-aware Exploration Bonus in Video Games

专知会员服务

16+阅读 · 2022年3月29日

【布朗大学David Abel博士论文】A Theory of Abstraction in Reinforcement Learning

【布朗大学David Abel博士论文】A Theory of Abstraction in Reinforcement Learning

专知会员服务

25+阅读 · 2022年3月16日

【DeepMind】基于模型的强化学习，174页ppt，Model-Based Reinforcement Learning

【DeepMind】基于模型的强化学习，174页ppt，Model-Based Reinforcement Learning

专知会员服务

89+阅读 · 2021年1月12日

【牛津大学博士论文】基于强化学习的无地图机器人导航，Reinforcement Learning Based MRN

【牛津大学博士论文】基于强化学习的无地图机器人导航，Reinforcement Learning Based MRN

专知会员服务

121+阅读 · 2020年5月18日

【牛津大学】深度残差强化学习，Deep Residual Reinforcement Learning

【牛津大学】深度残差强化学习，Deep Residual Reinforcement Learning

专知会员服务

84+阅读 · 2020年2月18日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【强化学习研讨会|Microsoft Research】多智能体强化学习 Scalable and Robust Multi-Agent Reinforcement Learning，46页pdf，美国东北大学|Christopher Amato

【强化学习研讨会|Microsoft Research】多智能体强化学习 Scalable and Robust Multi-Agent Reinforcement Learning，46页pdf，美国东北大学|Christopher Amato

专知会员服务

26+阅读 · 2019年10月3日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

【论文推荐】最新六篇强化学习相关论文—Sublinear、机器阅读理解、加速强化学习、对抗性奖励学习、人机交互

【论文推荐】最新六篇强化学习相关论文—Sublinear、机器阅读理解、加速强化学习、对抗性奖励学习、人机交互

专知

17+阅读 · 2018年4月28日

Reinforcement Learning: An Introduction 2018第二版 500页

Reinforcement Learning: An Introduction 2018第二版 500页

CreateAMind

14+阅读 · 2018年4月27日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

空间分数阶Schr？dinger方程的时间分裂谱方法

国家自然科学基金

0+阅读 · 2014年12月31日

新型靶向纳米药物Angio-Ag-NRPMAb的制备及其抑制胶质瘤的机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

可溶性CEA诱导结直肠癌中内皮祖细胞归巢促进肿瘤血管新生的机制探讨

国家自然科学基金

0+阅读 · 2014年12月31日

基于fMRI脑功能成像的机器人辅助腕手神经康复训练与评价方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于SURE/PURE准则的图像盲反卷积算法研究

国家自然科学基金

3+阅读 · 2013年12月31日

长链非编码RNA-uc002mbe.2介导的HDACi凋亡效应及其在肝癌中的作用

国家自然科学基金

0+阅读 · 2012年12月31日

新型钠钙交换蛋白NCEX对糖尿病大血管病变的作用和黄芪多糖的干预机制

国家自然科学基金

0+阅读 · 2011年12月31日

AlGaN基PIN太阳光盲雪崩探测器研究

国家自然科学基金

0+阅读 · 2008年12月31日

低功率聚焦超声联合微泡靶向增强脑组织药物浓度的机制

国家自然科学基金

0+阅读 · 2008年12月31日

Enabling Efficient Interaction between an Algorithm Agent and an LLM: A Reinforcement Learning Approach

Arxiv

0+阅读 · 2023年6月6日

Action Noise in Off-Policy Deep Reinforcement Learning: Impact on Exploration and Performance

Arxiv

0+阅读 · 2023年6月5日

An adaptive safety layer with hard constraints for safe reinforcement learning in multi-energy management systems

Arxiv

0+阅读 · 2023年6月5日

Learning Meta Representations for Agents in Multi-Agent Reinforcement Learning

Arxiv

0+阅读 · 2023年6月5日

XRoute Environment: A Novel Reinforcement Learning Environment for Routing

Arxiv

0+阅读 · 2023年6月5日

Context-Aware Bayesian Network Actor-Critic Methods for Cooperative Multi-Agent Reinforcement Learning

Arxiv

0+阅读 · 2023年6月2日

Task-Agnostic Structured Pruning of Speech Representation Models

Arxiv

0+阅读 · 2023年6月2日

A Survey of Meta-Reinforcement Learning

Arxiv

12+阅读 · 2023年1月19日

Recent Advances in Reinforcement Learning in Finance

Arxiv

11+阅读 · 2021年12月8日

Multi-Task Feature Learning for Knowledge Graph Enhanced Recommendation

Arxiv

15+阅读 · 2019年1月23日

VIP会员

文章信息

相关主题

相关VIP内容

【AI+军事】美国HRL实验室AAAI2020《基于强化学习的多智能体任务规划》，Multi-Agent Mission Planning with Reinforcement Learning

【AI+军事】美国HRL实验室AAAI2020《基于强化学习的多智能体任务规划》，Multi-Agent Mission Planning with Reinforcement Learning

专知会员服务

231+阅读 · 2022年4月10日

【ToG 2021】强化学习中图像局部区域敏感的探索奖励，Deep Reinforcement Learning with Part-aware Exploration Bonus in Video Games

【ToG 2021】强化学习中图像局部区域敏感的探索奖励，Deep Reinforcement Learning with Part-aware Exploration Bonus in Video Games

专知会员服务

16+阅读 · 2022年3月29日

【布朗大学David Abel博士论文】A Theory of Abstraction in Reinforcement Learning

【布朗大学David Abel博士论文】A Theory of Abstraction in Reinforcement Learning

专知会员服务

25+阅读 · 2022年3月16日

【DeepMind】基于模型的强化学习，174页ppt，Model-Based Reinforcement Learning

【DeepMind】基于模型的强化学习，174页ppt，Model-Based Reinforcement Learning

专知会员服务

89+阅读 · 2021年1月12日

【牛津大学博士论文】基于强化学习的无地图机器人导航，Reinforcement Learning Based MRN

【牛津大学博士论文】基于强化学习的无地图机器人导航，Reinforcement Learning Based MRN

专知会员服务

121+阅读 · 2020年5月18日

【牛津大学】深度残差强化学习，Deep Residual Reinforcement Learning

【牛津大学】深度残差强化学习，Deep Residual Reinforcement Learning

专知会员服务

84+阅读 · 2020年2月18日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【强化学习研讨会|Microsoft Research】多智能体强化学习 Scalable and Robust Multi-Agent Reinforcement Learning，46页pdf，美国东北大学|Christopher Amato

【强化学习研讨会|Microsoft Research】多智能体强化学习 Scalable and Robust Multi-Agent Reinforcement Learning，46页pdf，美国东北大学|Christopher Amato

专知会员服务

26+阅读 · 2019年10月3日

热门VIP内容

开通专知VIP会员享更多权益服务

新书册《几何深度学习的数学基础》

中程单向攻击无人机的战略意义：俄乌战争启示

在无标注条件下适配视觉—语言模型：全面综述

面向视觉语言模型的持续学习：遗忘之外的综述与分类体系

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

【论文推荐】最新六篇强化学习相关论文—Sublinear、机器阅读理解、加速强化学习、对抗性奖励学习、人机交互

【论文推荐】最新六篇强化学习相关论文—Sublinear、机器阅读理解、加速强化学习、对抗性奖励学习、人机交互

专知

17+阅读 · 2018年4月28日

Reinforcement Learning: An Introduction 2018第二版 500页

Reinforcement Learning: An Introduction 2018第二版 500页

CreateAMind

14+阅读 · 2018年4月27日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

相关论文

Enabling Efficient Interaction between an Algorithm Agent and an LLM: A Reinforcement Learning Approach

Arxiv

0+阅读 · 2023年6月6日

Action Noise in Off-Policy Deep Reinforcement Learning: Impact on Exploration and Performance

Arxiv

0+阅读 · 2023年6月5日

An adaptive safety layer with hard constraints for safe reinforcement learning in multi-energy management systems

Arxiv

0+阅读 · 2023年6月5日

Learning Meta Representations for Agents in Multi-Agent Reinforcement Learning

Arxiv

0+阅读 · 2023年6月5日

XRoute Environment: A Novel Reinforcement Learning Environment for Routing

Arxiv

0+阅读 · 2023年6月5日

Context-Aware Bayesian Network Actor-Critic Methods for Cooperative Multi-Agent Reinforcement Learning

Arxiv

0+阅读 · 2023年6月2日

Task-Agnostic Structured Pruning of Speech Representation Models

Arxiv

0+阅读 · 2023年6月2日

A Survey of Meta-Reinforcement Learning

Arxiv

12+阅读 · 2023年1月19日

Recent Advances in Reinforcement Learning in Finance

Arxiv

11+阅读 · 2021年12月8日

Multi-Task Feature Learning for Knowledge Graph Enhanced Recommendation

Arxiv

15+阅读 · 2019年1月23日

相关基金

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

空间分数阶Schr？dinger方程的时间分裂谱方法

国家自然科学基金

0+阅读 · 2014年12月31日

新型靶向纳米药物Angio-Ag-NRPMAb的制备及其抑制胶质瘤的机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

可溶性CEA诱导结直肠癌中内皮祖细胞归巢促进肿瘤血管新生的机制探讨

国家自然科学基金

0+阅读 · 2014年12月31日

基于fMRI脑功能成像的机器人辅助腕手神经康复训练与评价方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于SURE/PURE准则的图像盲反卷积算法研究

国家自然科学基金

3+阅读 · 2013年12月31日

长链非编码RNA-uc002mbe.2介导的HDACi凋亡效应及其在肝癌中的作用

国家自然科学基金

0+阅读 · 2012年12月31日

新型钠钙交换蛋白NCEX对糖尿病大血管病变的作用和黄芪多糖的干预机制

国家自然科学基金

0+阅读 · 2011年12月31日

AlGaN基PIN太阳光盲雪崩探测器研究

国家自然科学基金

0+阅读 · 2008年12月31日

低功率聚焦超声联合微泡靶向增强脑组织药物浓度的机制

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员