In this paper, we focus on a wireless-powered sensor network coordinated by a multi-antenna access point (AP). Each node generates sensing information and reports its latest information to the AP using energy harvested from the AP's signal beamforming. We aim to minimize the average age-of-information (AoI) by jointly adapting the nodes' transmission scheduling and transmission control strategies. To reduce the transmission delay, an intelligent reflecting surface (IRS) is used to enhance the channel conditions, with the AP's beamforming vector and the IRS's phase-shifting matrix controlled jointly. Considering dynamic data arrivals at different sensing nodes, we propose a hierarchical deep reinforcement learning (DRL) framework for AoI minimization in two steps. The nodes' transmission scheduling is first determined by an outer-loop DRL approach, e.g., the DQN or PPO algorithm, and an inner-loop optimization then adapts either the uplink information transmission or the downlink energy transfer to all nodes. A simple and efficient approximation is also proposed to reduce the inner-loop run-time overhead. Numerical results verify that the hierarchical learning framework outperforms typical baselines in terms of the average AoI and proportional fairness among different nodes.
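To make the objective concrete, the following is a minimal sketch of how an average AoI could be computed under a given scheduling rule. It is not the hierarchical DRL framework described above, and it ignores energy harvesting, the IRS, and beamforming. The slotted-time model, the one-node-per-slot rule, the assumption that every scheduled transmission succeeds and resets the age to 1, and the names `simulate_average_aoi`, `round_robin`, and `max_age_first` are all illustrative assumptions rather than details taken from the paper.

```python
from typing import Callable, List

# A scheduler maps (slot index, current ages) to the index of the node to serve.
Scheduler = Callable[[int, List[int]], int]


def simulate_average_aoi(num_nodes: int, num_slots: int, schedule: Scheduler) -> float:
    """Return the AoI averaged over all nodes and all slots.

    Assumptions (illustrative, not from the paper): slotted time, exactly one
    node scheduled per slot, every scheduled node delivers a fresh update
    successfully, and a delivered update resets that node's age to 1.
    """
    ages = [1] * num_nodes
    total = 0
    for t in range(num_slots):
        chosen = schedule(t, ages)
        # Every node ages by one slot, except the node that just reported.
        ages = [1 if n == chosen else a + 1 for n, a in enumerate(ages)]
        total += sum(ages)
    return total / (num_slots * num_nodes)


def round_robin(t: int, ages: List[int]) -> int:
    """Serve the nodes in a fixed cyclic order."""
    return t % len(ages)


def max_age_first(t: int, ages: List[int]) -> int:
    """Greedy baseline: serve the node with the largest current age."""
    return max(range(len(ages)), key=ages.__getitem__)


if __name__ == "__main__":
    for name, rule in [("round robin", round_robin), ("max age first", max_age_first)]:
        print(name, simulate_average_aoi(num_nodes=4, num_slots=1000, schedule=rule))
```

In this toy setting an outer-loop learner would take the place of the hand-written `schedule` rule, while the inner-loop optimization would determine whether a scheduled transmission actually succeeds within the slot.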