Millions of battery-powered sensors deployed for monitoring in scenarios such as agriculture, smart cities, and industry require energy-efficient solutions to prolong their lifetime. When these sensors observe a phenomenon distributed in space and evolving in time, the collected observations are expected to be correlated in both time and space. In this paper, we propose a Deep Reinforcement Learning (DRL) based scheduling mechanism that takes advantage of this correlated information. We design our solution using the Deep Deterministic Policy Gradient (DDPG) algorithm. The proposed mechanism determines the frequency with which sensors should transmit their updates so as to ensure accurate collection of observations while simultaneously accounting for the energy available to each sensor. To evaluate our scheduling mechanism, we use multiple datasets of environmental observations obtained from real deployments; these real observations enable us to model the environment with which the mechanism interacts as realistically as possible. We show that our solution can significantly extend the sensors' lifetime. We compare our mechanism against an idealized, all-knowing scheduler to demonstrate that its performance is near-optimal. Additionally, we highlight the distinctive feature of our design, energy awareness, by showing how sensors' energy levels affect the frequency of their updates.
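To make the scheduling idea concrete, the sketch below shows one possible interface for an energy-aware scheduler: a policy maps a sensor's state (remaining energy and an estimate of how stale or inaccurate the sink's view of its reading is) to the time until its next transmission. This is an illustrative hand-crafted stand-in for the trained DDPG actor, not the paper's implementation; the function name, state variables, and interval bounds are assumptions for illustration.

```python
def update_interval(energy_frac, error_estimate,
                    min_interval=60, max_interval=3600):
    """Return seconds until the sensor's next transmission.

    energy_frac    : remaining battery fraction, in [0, 1] (assumed normalization)
    error_estimate : predicted deviation of the sink's estimate from the true
                     reading, in [0, 1] (assumed normalization)

    Hypothetical policy standing in for the DDPG actor: high expected error
    shortens the interval (more frequent updates for accuracy), while a
    draining battery stretches it (fewer updates to prolong lifetime).
    """
    # Urgency grows with expected error and shrinks as energy runs low.
    urgency = error_estimate * energy_frac
    return max_interval - (max_interval - min_interval) * urgency
```

With a full battery and high expected error the sensor transmits at the minimum interval; as energy drains, the same error level yields progressively longer intervals, which is the energy-awareness behavior the abstract highlights.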