Recently, reinforcement learning (RL) has been applied as an anti-adversarial remedy in wireless communication networks. However, studying RL-based approaches from the adversary's perspective has received little attention. Additionally, RL-based approaches in the anti-adversary or adversarial paradigm mostly consider single-channel communication (either channel selection or single-channel power control), whereas multi-channel communication is more common in practice. In this paper, we propose a multi-agent adversary system (MAAS) for modeling and analyzing adversaries in a wireless communication scenario through careful design of the reward function under realistic communication conditions. In particular, by modeling the adversaries as learning agents, we show that the proposed MAAS successfully chooses the transmission channel(s) and the respective allocated power(s) without any prior knowledge of the sender's strategy. Compared to a single-agent adversary (SAA), the multiple agents in MAAS achieve a significantly larger reduction in the sender's signal-to-interference-plus-noise ratio (SINR) under the same power constraints and partial observability, while providing improved stability and a more efficient learning process. Moreover, through empirical studies we show that the simulation results closely match those of real-world communication, a conclusion pivotal to the validity of evaluating agent performance in simulation.
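To make the reward design concrete, the following is a minimal sketch of the core idea described above: each adversary agent selects a channel and a power level, and the shared reward is the negative of the sender's SINR on its transmission channel. All names and constants here (N_CHANNELS, NOISE_FLOOR, the two-agent split of the power budget) are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

N_CHANNELS = 8      # number of orthogonal channels (assumed)
NOISE_FLOOR = 1e-3  # ambient noise power, linear scale (assumed)

def sender_sinr(tx_power, tx_channel, jam_channels, jam_powers):
    """SINR seen by the receiver on the sender's channel.

    Only jamming power landing on tx_channel counts as interference.
    """
    interference = sum(p for c, p in zip(jam_channels, jam_powers)
                       if c == tx_channel)
    return tx_power / (NOISE_FLOOR + interference)

def maas_reward(tx_power, tx_channel, jam_channels, jam_powers):
    """Shared adversary reward: lower sender SINR -> higher reward (in dB)."""
    return -10.0 * np.log10(sender_sinr(tx_power, tx_channel,
                                        jam_channels, jam_powers))

# Example: two jamming agents splitting a joint power budget of 1.0,
# acting against a sender whose strategy they do not observe.
rng = np.random.default_rng(0)
tx_channel, tx_power = rng.integers(N_CHANNELS), 1.0
jam_channels = rng.integers(N_CHANNELS, size=2)  # each agent's channel choice
jam_powers = np.full(2, 0.5)                     # each agent's power choice
print(maas_reward(tx_power, tx_channel, jam_channels, jam_powers))
```

Under this reward, the agents are incentivized to concentrate power on the channel(s) the sender is likely using, which is what the learning agents in MAAS must infer from experience rather than from prior knowledge of the sender's strategy.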