Deep reinforcement learning offers a model-free alternative to supervised deep learning and classical optimization for solving the transmit power control problem in wireless networks. The multi-agent deep reinforcement learning approach treats each transmitter as an individual learning agent that determines its transmit power level by observing its local wireless environment. Following a certain policy, these agents learn to collaboratively maximize a global objective, e.g., a sum-rate utility function. This multi-agent scheme scales easily and is practical for large-scale cellular networks. In this work, we present a continuous power control algorithm with distributed execution, based on deep actor-critic learning and, more specifically, on an adaptation of deep deterministic policy gradient (DDPG). Furthermore, we integrate the proposed power control algorithm into a time-slotted system where devices are mobile and channel conditions change rapidly. We demonstrate the functionality of the proposed algorithm using simulation results.
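To make the global objective concrete, the following is a minimal sketch of a sum-rate utility, assuming the common Shannon-rate form in which each link's rate is log2(1 + SINR). The abstract does not define the utility, so this form, the function name `sum_rate`, and the parameters `gains`, `powers`, and `noise_power` are illustrative assumptions rather than the authors' formulation.

```python
import math


def sum_rate(gains, powers, noise_power):
    """Sum of per-link rates in bits/s/Hz (hypothetical helper).

    Assumed conventions, not taken from the paper:
    gains[i][j] is the channel power gain from transmitter j to receiver i,
    powers[j] is the transmit power chosen by agent j,
    noise_power is the receiver noise power, shared by all links.
    """
    n = len(powers)
    total = 0.0
    for i in range(n):
        # Desired signal power at receiver i from its own transmitter.
        signal = gains[i][i] * powers[i]
        # Interference from every other transmitter.
        interference = sum(gains[i][j] * powers[j] for j in range(n) if j != i)
        sinr = signal / (interference + noise_power)
        total += math.log2(1.0 + sinr)
    return total
```

Under these assumptions, each agent's chosen power raises its own link's rate while adding interference to every other link, which is why the agents have to learn to cooperate rather than all transmit at full power.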