The recent interweaving of AI and 6G technologies has sparked extensive research interest in further enhancing reliable and timely communications. \emph{Age of Information} (AoI), a novel and integrated metric capturing the intricate trade-offs among reliability, latency, and update frequency, has been extensively studied since its inception. This paper contributes new results in this area by employing a Deep Reinforcement Learning (DRL) approach to intelligently decide how to allocate power resources and when to retransmit in a \emph{freshness-sensitive} downlink multi-user Non-Orthogonal Multiple Access (NOMA) network aided by Hybrid Automatic Repeat reQuest with Chase Combining (HARQ-CC). Specifically, the AoI minimization problem is formulated as a Markov Decision Process (MDP). Then, to obtain deterministic, age-optimal, and intelligent power-allocation and retransmission decisions, a Double Dueling Deep Q-Network (DQN) is adopted. Furthermore, a more flexible retransmission scheme, referred to as the Retransmit-At-Will scheme, is proposed to further improve the timeliness of the HARQ-aided NOMA network. Simulation results verify the superiority of the proposed intelligent scheme and reveal the threshold structure of the retransmission policy. The question of whether user pairing is necessary is also addressed through extensive simulations.
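To make the AoI metric concrete, the following is a minimal illustrative sketch of per-user AoI evolution in slotted time, assuming a generate-at-will source whose successful delivery carries a fresh (one-slot-old) update; the function name \texttt{simulate\_aoi} and the interface are ours, not the paper's model, and under HARQ-CC a delivered packet may instead be several slots old, in which case AoI would reset to that packet's age rather than to one.

```python
def simulate_aoi(success_slots, horizon):
    """Sketch of AoI dynamics (assumption: fresh packet per delivery).

    AoI grows by one each slot; on a successful decoding it resets to 1,
    the age of a just-generated update. With HARQ-CC retransmissions the
    reset value would instead equal the delivered packet's current age.
    """
    aoi, trace = 0, []
    for t in range(horizon):
        aoi = 1 if t in success_slots else aoi + 1
        trace.append(aoi)
    return trace


# Deliveries in slots 2 and 5 reset the age; it climbs linearly otherwise.
print(simulate_aoi({2, 5}, 7))  # -> [1, 2, 1, 2, 3, 1, 2]
```

The sawtooth trace this produces is exactly the shape whose long-run average the DRL agent in the paper seeks to minimize by choosing power levels and retransmission instants.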