Unmanned aerial vehicles (UAVs) have the potential to greatly aid Internet of Things (IoT) networks in mission-critical data collection, thanks to their flexibility and cost-effectiveness. However, challenges arise from the UAV's limited onboard energy and the unpredictable status updates from sensor nodes (SNs), which degrade the freshness of the collected data. In this paper, we investigate energy-efficient and timely data collection in IoT networks using a solar-powered UAV. Each SN generates status updates at stochastic intervals, and the UAV collects these updates and forwards them to a central data center. Furthermore, the UAV harvests solar energy from the environment to keep its energy level above a predetermined threshold. To minimize both the average age of information (AoI) of the SNs and the energy consumption of the UAV, we jointly optimize the UAV trajectory, SN scheduling, and offloading strategy. We formulate this problem as a Markov decision process (MDP) and propose a meta-reinforcement learning algorithm to enhance generalization. Specifically, we propose a compound-action deep reinforcement learning (CADRL) algorithm to handle the discrete decisions of SN scheduling and the UAV's offloading policy, as well as the continuous control of UAV flight. Moreover, we incorporate meta-learning into CADRL to improve the adaptability of the learned policy to new tasks. To validate the effectiveness of the proposed algorithms, we conduct extensive simulations and demonstrate their superiority over baseline algorithms.
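For reference, the AoI objective mentioned above is conventionally defined as follows; the notation here is the standard one from the AoI literature, not taken from the paper itself. The AoI of SN $k$ at time $t$ is the elapsed time since the generation of the freshest update from SN $k$ delivered to the data center:

$$
\Delta_k(t) = t - U_k(t),
$$

where $U_k(t)$ denotes the generation time of that most recently delivered update. Over a horizon $T$ with $K$ SNs, the average AoI being minimized is typically

$$
\bar{\Delta} = \frac{1}{KT} \sum_{k=1}^{K} \int_{0}^{T} \Delta_k(t)\, \mathrm{d}t .
$$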
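The abstract does not specify the CADRL network architecture. The PyTorch sketch below shows one common way to realize a compound action space of this kind: a shared torso with separate categorical heads for the discrete decisions (SN scheduling, offloading) and a Gaussian head for continuous flight control. All class names, dimensions, and observation contents are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a compound-action policy: discrete heads pick the SN
# to serve and the offloading decision; a Gaussian head outputs the continuous
# flight action (e.g., heading and speed). Dimensions are placeholders.
import torch
import torch.nn as nn
from torch.distributions import Categorical, Normal

class CompoundActionPolicy(nn.Module):
    def __init__(self, obs_dim: int, num_sns: int, flight_dim: int = 2):
        super().__init__()
        # Shared torso encodes the state (UAV position/energy, SN AoI, etc.).
        self.torso = nn.Sequential(
            nn.Linear(obs_dim, 128), nn.Tanh(),
            nn.Linear(128, 128), nn.Tanh(),
        )
        self.sn_head = nn.Linear(128, num_sns)       # which SN to schedule
        self.offload_head = nn.Linear(128, 2)        # offload to data center or not
        self.flight_mu = nn.Linear(128, flight_dim)  # mean of the flight action
        self.flight_log_std = nn.Parameter(torch.zeros(flight_dim))

    def forward(self, obs: torch.Tensor):
        h = self.torso(obs)
        sn_dist = Categorical(logits=self.sn_head(h))
        offload_dist = Categorical(logits=self.offload_head(h))
        flight_dist = Normal(self.flight_mu(h), self.flight_log_std.exp())
        return sn_dist, offload_dist, flight_dist

# Sampling a compound action: the joint log-probability used by a
# policy-gradient update factorizes into the sum over the heads.
policy = CompoundActionPolicy(obs_dim=32, num_sns=10)
obs = torch.randn(1, 32)
sn_dist, offload_dist, flight_dist = policy(obs)
sn, offload, flight = sn_dist.sample(), offload_dist.sample(), flight_dist.sample()
log_prob = sn_dist.log_prob(sn) + offload_dist.log_prob(offload) \
           + flight_dist.log_prob(flight).sum(-1)
```

Factorizing the policy across heads keeps each action distribution simple while still letting the shared torso coordinate the discrete and continuous decisions; a meta-learning wrapper (as the paper proposes) would then adapt these shared parameters to new tasks.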