Distributed access control is a crucial component of massive machine-type communication (mMTC). In this scenario, centralized resource allocation does not scale because resource configurations must be sent frequently from the base station to a massive number of devices. We investigate distributed reinforcement learning for resource selection that does not rely on centralized control. Another important feature of mMTC is its sporadic and dynamically changing traffic. Existing studies on distributed access control either assume a static traffic load or adapt to dynamic traffic only gradually. We minimize the adaptation period by training TinyQMIX, a lightweight multi-agent deep reinforcement learning model, to learn a distributed wireless resource selection policy under various traffic patterns before deployment. The trained agents can therefore adapt quickly to dynamic traffic and provide low access delay. Numerical results are presented to support our claims.
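To make the QMIX-based approach concrete, the following is a minimal sketch (not the authors' implementation) of a QMIX-style mixing network in PyTorch. During centralized training, per-agent Q-values are combined into a joint value Q_tot by a mixing network whose weights are generated from the global state and constrained to be nonnegative, so Q_tot is monotonic in each agent's Q; at deployment each agent then acts greedily on its own Q-network alone, which matches the decentralized execution described above. All names and hyperparameters (QMixer, state_dim, embed_dim) are illustrative assumptions.

```python
# Minimal QMIX-style mixing network sketch (illustrative, not the paper's code).
import torch
import torch.nn as nn

class QMixer(nn.Module):
    """Combines per-agent Q-values into Q_tot.

    Mixing weights come from hypernetworks conditioned on the global
    state and are passed through abs(), so dQ_tot/dQ_i >= 0. This
    monotonicity lets each agent maximize its own Q at execution time
    without any centralized coordination.
    """
    def __init__(self, n_agents: int, state_dim: int, embed_dim: int = 32):
        super().__init__()
        self.n_agents = n_agents
        self.embed_dim = embed_dim
        # Hypernetworks: global state -> mixing weights and biases.
        self.hyper_w1 = nn.Linear(state_dim, n_agents * embed_dim)
        self.hyper_b1 = nn.Linear(state_dim, embed_dim)
        self.hyper_w2 = nn.Linear(state_dim, embed_dim)
        self.hyper_b2 = nn.Sequential(
            nn.Linear(state_dim, embed_dim), nn.ReLU(), nn.Linear(embed_dim, 1)
        )

    def forward(self, agent_qs: torch.Tensor, state: torch.Tensor) -> torch.Tensor:
        # agent_qs: (batch, n_agents); state: (batch, state_dim)
        bs = agent_qs.size(0)
        qs = agent_qs.view(bs, 1, self.n_agents)
        # abs() enforces nonnegative weights (the monotonicity constraint).
        w1 = torch.abs(self.hyper_w1(state)).view(bs, self.n_agents, self.embed_dim)
        b1 = self.hyper_b1(state).view(bs, 1, self.embed_dim)
        hidden = torch.relu(torch.bmm(qs, w1) + b1)   # (bs, 1, embed_dim)
        w2 = torch.abs(self.hyper_w2(state)).view(bs, self.embed_dim, 1)
        b2 = self.hyper_b2(state).view(bs, 1, 1)
        q_tot = torch.bmm(hidden, w2) + b2            # (bs, 1, 1)
        return q_tot.view(bs, 1)

# Example forward pass: 4 agents, a 16-dimensional global state.
mixer = QMixer(n_agents=4, state_dim=16)
agent_qs = torch.randn(8, 4)      # chosen-action Q-values from 4 agents
state = torch.randn(8, 16)        # global state (training only)
q_tot = mixer(agent_qs, state)    # (8, 1) joint value for the TD loss
```

The mixing network is only needed while training before deployment; a "tiny" variant would shrink the per-agent networks and embedding sizes so the deployed agents stay lightweight.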