In most Internet of Things (IoT) networks, edge nodes commonly serve as relays that cache sensing data generated by IoT sensors and provide communication services for data consumers. However, a critical issue in IoT sensing is that data are usually transient: cached content items must be updated in a timely manner, yet frequent cache updates can incur considerable energy cost and shorten the lifetime of IoT sensors. To address this issue, we adopt the Age of Information (AoI) to quantify data freshness and propose an online cache update scheme that achieves an effective tradeoff between the average AoI and energy cost. Specifically, we first characterize the transmission energy consumption at IoT sensors by incorporating a successful transmission condition. Then, we model cache updating as a Markov decision process that minimizes the average weighted cost, with judicious definitions of state, action, and reward. Since user preference towards content items is usually unknown and often evolves over time, we develop a deep reinforcement learning (DRL) algorithm to enable intelligent cache updates. Through trial-and-error exploration, an effective caching policy can be learned without requiring exact knowledge of content popularity. Simulation results demonstrate the superiority of the proposed framework.
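To make the AoI and energy tradeoff concrete, the following is a minimal illustrative sketch, not the scheme proposed in this work. It assumes a slotted system in which at most one cached item is refreshed per slot, a fixed energy cost per update (the abstract's successful-transmission energy model is not reproduced), and a linear weighted cost `weight * mean AoI + (1 - weight) * energy`. All names (`simulate_weighted_cost`, `never_update`, `refresh_oldest`) and parameter values are hypothetical, and the two hand-written policies merely stand in for a learned DRL policy.

```python
from typing import Callable, List


def simulate_weighted_cost(
    policy: Callable[[List[int]], int],
    num_items: int = 3,
    horizon: int = 12,
    energy_per_update: float = 1.0,  # assumed constant energy per cache update
    weight: float = 0.5,             # assumed tradeoff weight between AoI and energy
) -> float:
    """Return the average per-slot cost: weight * mean AoI + (1 - weight) * energy."""
    ages = [1] * num_items  # AoI of each cached item, in slots
    total = 0.0
    for _ in range(horizon):
        # The policy observes the current ages and returns the index of the
        # item to refresh, or -1 to skip the update in this slot.
        action = policy(ages)
        # Every cached item grows one slot older.
        ages = [a + 1 for a in ages]
        energy = 0.0
        if action >= 0:
            ages[action] = 1            # a fresh copy resets the item's AoI
            energy = energy_per_update  # the sensor spends energy to transmit
        total += weight * sum(ages) / num_items + (1 - weight) * energy
    return total / horizon


def never_update(ages: List[int]) -> int:
    """Baseline policy: save energy and let all items go stale."""
    return -1


def refresh_oldest(ages: List[int]) -> int:
    """Baseline policy: always refresh the item with the largest AoI."""
    return max(range(len(ages)), key=ages.__getitem__)


if __name__ == "__main__":
    print("never update  :", simulate_weighted_cost(never_update))
    print("refresh oldest:", simulate_weighted_cost(refresh_oldest))
```

Under these assumptions, never updating spends no energy but lets the AoI grow without bound, while always refreshing the oldest item keeps the AoI low at a constant energy cost per slot. A learned policy would aim to sit between these extremes and, unlike these baselines, would adapt to how often each item is actually requested.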