In many emerging Internet of Things (IoT) applications, the freshness of the information is an important design criterion. Age of Information (AoI) quantifies the freshness of a received message or status update. This work considers a network of deployed IoT devices in which multiple unmanned aerial vehicles (UAVs) serve as mobile relay nodes between the sensors and the base station. We formulate an optimization problem to jointly plan the UAVs' trajectories while minimizing the AoI of the received messages and the devices' energy consumption. The solution accounts for the UAVs' battery lifetime and flight time to recharging depots to ensure the UAVs' green operation. The complex optimization problem is solved efficiently using a deep reinforcement learning algorithm. In particular, we propose a deep Q-network, which serves as a function approximator to estimate the state-action value function. The proposed scheme converges quickly and achieves lower ergodic age and ergodic energy consumption than benchmark algorithms such as the greedy algorithm (GA), nearest neighbour (NN), and random walk (RW).
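To make the AoI metric concrete, the following is a minimal sketch of how the age of one device's information might evolve in discrete time slots. It assumes a common slotted model that the abstract does not itself specify: the age grows by one slot whenever no update is delivered and resets to the delivery delay when an update reaches the base station. The function names `update_age` and `average_age` and the `delay` parameter are hypothetical, chosen only for illustration.

```python
def update_age(age, delivered, delay=1):
    """Return the next-slot AoI for a single device.

    Assumed model (not taken from the paper): the age increases by one
    slot when nothing is delivered, and resets to `delay` slots when a
    fresh update is delivered.
    """
    return delay if delivered else age + 1


def average_age(deliveries, initial_age=1):
    """Time-averaged AoI over a sequence of per-slot delivery flags."""
    age = initial_age
    total = 0
    for delivered in deliveries:
        age = update_age(age, delivered)
        total += age
    return total / len(deliveries)


if __name__ == "__main__":
    # One delivery in the third of four slots.
    print(average_age([False, False, True, False]))
```

Under this assumed model, a UAV trajectory that visits a device more often yields more delivery slots and therefore a lower time-averaged age, which is the quantity the optimization trades off against energy consumption.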