Deep reinforcement learning (DRL) has recently been used to perform efficient resource allocation in wireless communications. In this paper, the vulnerabilities of such DRL agents to adversarial attacks are studied. In particular, we consider multiple DRL agents that perform both dynamic channel access and power control in wireless interference channels. For these victim DRL agents, we design a jammer, which is also a DRL agent. We propose an adversarial jamming attack scheme that utilizes a listening phase and significantly degrades the users' sum rate. Subsequently, we develop an ensemble policy defense strategy against such a jamming attacker by reloading models (saved during retraining) that have minimum transition correlation.
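To make the attack metric concrete, the sketch below computes a sum rate as the sum of per-user Shannon rates, log2(1 + SINR), with jamming power added to the interference term. It is only an illustration under simplifying assumptions and not the paper's system model: a single shared channel, fixed power gains, and the hypothetical names `sum_rate`, `gains`, `jam_gains`, and `jam_power` are all introduced here.

```python
import math


def sum_rate(gains, powers, noise, jam_gains=None, jam_power=0.0):
    """Sum of per-user Shannon rates (bits/s/Hz) on one shared channel.

    Assumed (hypothetical) layout:
      gains[i][j]  power gain from transmitter j to receiver i
      powers[j]    transmit power of user j
      jam_gains[i] power gain from the jammer to receiver i
    """
    n = len(powers)
    total = 0.0
    for i in range(n):
        signal = gains[i][i] * powers[i]
        # Interference from all other users sharing the channel.
        interference = sum(gains[i][j] * powers[j] for j in range(n) if j != i)
        # Jamming power received by user i (zero if no jammer is present).
        jamming = jam_gains[i] * jam_power if jam_gains is not None else 0.0
        sinr = signal / (noise + interference + jamming)
        total += math.log2(1.0 + sinr)
    return total


# Two users with no cross-interference, unit power and unit noise.
gains = [[1.0, 0.0], [0.0, 1.0]]
powers = [1.0, 1.0]
clean = sum_rate(gains, powers, noise=1.0)
jammed = sum_rate(gains, powers, noise=1.0, jam_gains=[1.0, 1.0], jam_power=2.0)
```

In this toy setting the jammer lowers every user's SINR and therefore the sum rate; the paper's jammer additionally learns when and where to transmit, which this sketch does not model.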