State uncertainty poses a major challenge for decentralized coordination but is largely neglected in state-of-the-art research due to a strong focus on state-based centralized training for decentralized execution (CTDE) and on benchmarks that lack sufficient stochasticity, such as the StarCraft Multi-Agent Challenge (SMAC). In this paper, we propose Attention-based Embeddings of Recurrence In multi-Agent Learning (AERIAL) to approximate value functions under agent-wise state uncertainty. AERIAL replaces the true state with a learned representation of multi-agent recurrence, which provides more accurate information about decentralized agent decisions than state-based CTDE. We then introduce MessySMAC, a modified version of SMAC with stochastic observations and higher variance in initial states, as a more general and configurable benchmark for state uncertainty. We evaluate AERIAL in Dec-Tiger as well as in a variety of SMAC and MessySMAC maps, and compare the results with state-based CTDE. Furthermore, we evaluate the robustness of AERIAL and state-based CTDE against various state uncertainty configurations in MessySMAC.
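To make the core idea concrete, the following is a minimal sketch, not the authors' implementation, of how a centralized value function can consume attention-based embeddings of agents' recurrent states instead of the true state. All module choices and names here are illustrative assumptions (a per-agent GRU, a single self-attention layer, mean pooling, and the dimensions): the paper specifies only that the true state is replaced by a learned representation of multi-agent recurrence.

```python
# Illustrative sketch (assumed architecture, not the authors' code):
# a centralized value function that attends over all agents' recurrent
# hidden states rather than conditioning on the true environment state.
import torch
import torch.nn as nn

class AerialValueSketch(nn.Module):
    def __init__(self, obs_dim: int, n_agents: int,
                 hidden_dim: int = 64, n_heads: int = 4):
        super().__init__()
        # Each agent encodes its own action-observation history with a GRU,
        # as is common in recurrent multi-agent RL.
        self.agent_rnn = nn.GRU(obs_dim, hidden_dim, batch_first=True)
        # Self-attention across the agents' hidden states yields the
        # embedding that stands in for the true state.
        self.attention = nn.MultiheadAttention(hidden_dim, n_heads,
                                               batch_first=True)
        self.value_head = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 1))

    def forward(self, obs_histories: torch.Tensor) -> torch.Tensor:
        # obs_histories: (batch, n_agents, time, obs_dim)
        b, n, t, d = obs_histories.shape
        _, h = self.agent_rnn(obs_histories.reshape(b * n, t, d))
        h = h.squeeze(0).reshape(b, n, -1)      # (batch, n_agents, hidden)
        z, _ = self.attention(h, h, h)          # attend across agents
        return self.value_head(z.mean(dim=1))   # pooled embedding -> value
```

The intended contrast with state-based CTDE is in the forward pass: the value estimate depends only on the agents' recurrent hidden states, which summarize their decentralized action-observation histories, so no privileged access to the true state is assumed at training time for the value input.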