While deep neural networks (DNNs) have strengthened the performance of cooperative multi-agent reinforcement learning (c-MARL), the agent policy can be easily perturbed by adversarial examples. Considering the safety-critical applications of c-MARL, such as traffic management, power management, and unmanned aerial vehicle control, it is crucial to test the robustness of a c-MARL algorithm before it is deployed in reality. Existing adversarial attacks for MARL could be used for testing, but each is limited to a single robustness aspect (e.g., reward, state, or action), while a c-MARL model could be attacked from any aspect. To overcome this challenge, we propose MARLSafe, the first robustness testing framework for c-MARL algorithms. First, motivated by the Markov Decision Process (MDP), MARLSafe considers the robustness of c-MARL algorithms comprehensively from three aspects, namely state robustness, action robustness, and reward robustness. Any c-MARL algorithm must simultaneously satisfy these robustness aspects to be considered secure. Second, due to the scarcity of c-MARL attacks, we propose c-MARL attacks as robustness testing algorithms from multiple aspects. Experiments on the \textit{SMAC} environment reveal that many state-of-the-art c-MARL algorithms have low robustness in all aspects, pointing out the urgent need to test and enhance the robustness of c-MARL algorithms.