In this paper, we propose a novel benchmark called the StarCraft Multi-Agent Challenges+ (SMAC+), in which agents learn to perform multi-stage tasks and to exploit environmental factors without precise reward functions. The previous challenge (SMAC), recognized as a standard benchmark for Multi-Agent Reinforcement Learning (MARL), is mainly concerned with ensuring that all agents cooperatively eliminate approaching adversaries solely through fine-grained micro-control guided by obvious reward functions. This challenge, in contrast, focuses on the exploration capability of MARL algorithms to efficiently learn implicit multi-stage tasks and environmental factors as well as micro-control. This study covers both offensive and defensive scenarios. In the offensive scenarios, agents must learn to first find opponents and then eliminate them. The defensive scenarios require agents to use topographic features; for example, agents need to position themselves behind protective structures to make it harder for enemies to attack them. We investigate MARL algorithms under SMAC+ and observe that recent approaches work well in settings similar to the previous challenges but perform poorly in offensive scenarios. Additionally, we observe that an enhanced exploration approach has a positive effect on performance but cannot completely solve all scenarios. This study proposes new directions for future research.