In this paper, we apply a multi-agent reinforcement learning (MARL) framework that allows the base station (BS) and the user equipments (UEs) to jointly learn a channel access policy and its signaling in a wireless multiple access scenario. In this framework, the BS and UEs are reinforcement learning (RL) agents that must cooperate in order to deliver data. A comparison with contention-free and contention-based baselines shows that our framework achieves superior goodput even under high traffic while maintaining a low collision rate. We also study the scalability of the proposed method, a major open problem in MARL, and this paper provides first results toward addressing it.
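To make the two metrics in the abstract concrete, the sketch below simulates a toy slotted multiple-access channel and measures goodput and collision rate under a simple contention-based (slotted-ALOHA-style) policy. This is an illustrative assumption only: the environment, the transmit probability `p_tx`, and the helper names `step` and `run_aloha` are hypothetical and do not reproduce the paper's MARL framework or its learned signaling.

```python
import random

# Toy slotted multiple-access environment (illustrative sketch; the
# paper's actual MARL setup, reward design, and signaling differ).
# Per slot, each UE picks an action: 0 = wait, 1 = transmit.

def step(actions):
    """One slot: a packet is delivered only when exactly one UE
    transmits; two or more simultaneous transmissions collide."""
    transmitters = sum(actions)
    delivered = transmitters == 1
    collided = transmitters > 1
    return delivered, collided

def run_aloha(num_ues=4, p_tx=0.25, slots=10_000, seed=0):
    """Contention-based baseline: each UE transmits i.i.d. with
    probability p_tx. Returns (goodput, collision rate), the two
    metrics the abstract compares across schemes."""
    rng = random.Random(seed)
    delivered = collisions = 0
    for _ in range(slots):
        actions = [int(rng.random() < p_tx) for _ in range(num_ues)]
        d, c = step(actions)
        delivered += d
        collisions += c
    return delivered / slots, collisions / slots

goodput, collision_rate = run_aloha()
```

With 4 UEs and `p_tx = 0.25`, the expected goodput is 4 · 0.25 · 0.75³ ≈ 0.42 deliveries per slot; a learned, coordinated policy would instead aim to schedule exactly one transmitter per slot, driving collisions toward zero.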