Growing at a fast pace, modern autonomous systems will soon be deployed at scale, opening up the possibility of cooperative multi-agent systems. Sharing information and distributing workloads allow autonomous agents to perform tasks better and increase computational efficiency. However, shared information can be modified to execute adversarial attacks on the deep learning models that are widely employed in modern systems. Thus, we aim to study the robustness of such systems, focusing on adversarial attacks in a novel multi-agent setting where agents communicate by sharing learned intermediate representations of their neural networks. We observe that an indistinguishable adversarial message can severely degrade performance, but its effect weakens as the number of benign agents increases. Furthermore, we show that black-box transfer attacks are more difficult in this setting than when directly perturbing the inputs, as the attacker must align the distribution of learned representations via domain adaptation. Our work studies robustness at the neural network level to contribute an additional layer of fault tolerance to modern security protocols, enabling more secure multi-agent systems.
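To make the attack setting concrete, the sketch below illustrates one way an attacker could craft a perturbed intermediate message with a PGD-style, L∞-bounded perturbation that maximizes the victim's task loss through the fusion stage. This is a minimal illustration under assumed interfaces; the module names (`encoder`, `fusion_head`, `task_loss`) and hyperparameters are hypothetical placeholders, not the implementation described in this work.

```python
# Hypothetical sketch: PGD-style attack on a shared intermediate representation.
# `encoder`, `fusion_head`, and `task_loss` are assumed placeholders, not the
# actual modules used in this work.
import torch

def attack_message(encoder, fusion_head, x_attacker, benign_feats, target,
                   task_loss, eps=0.1, alpha=0.01, steps=20):
    """Craft an adversarial intermediate message under an L-inf budget `eps`."""
    with torch.no_grad():
        clean_feat = encoder(x_attacker)          # honest message the attacker would send
    delta = torch.zeros_like(clean_feat).uniform_(-eps, eps).requires_grad_(True)

    for _ in range(steps):
        adv_feat = clean_feat + delta             # perturbed message
        # Victim fuses the adversarial message with messages from benign agents.
        pred = fusion_head([adv_feat] + benign_feats)
        loss = task_loss(pred, target)            # attacker maximizes the victim's loss
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()    # gradient-ascent step on the message
            delta.clamp_(-eps, eps)               # stay within the perturbation budget
            delta.grad = None
    return (clean_feat + delta).detach()
```

Because the perturbation is bounded in the feature space rather than the input space, the transmitted message remains close to a legitimate one, which is consistent with the observation above that such messages can be indistinguishable yet severely degrade performance.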