This paper investigates goal-oriented communication for remote estimation of multiple Markov sources in resource-constrained networks. An agent selects the update order of the sources and transmits the packet to a remote destination over an unreliable delay channel. The destination is tasked with source reconstruction for the purpose of actuation. We use the cost of actuation error (CAE) metric to capture the significance (semantics) of error at the point of actuation. We aim to find an optimal sampling policy that minimizes the time-averaged CAE subject to average resource constraints. We formulate this problem as an average-cost constrained Markov decision process (CMDP) and transform it into an unconstrained MDP using Lyapunov drift techniques. We then propose a low-complexity drift-plus-penalty (DPP) policy for systems with known source/channel statistics and a Lyapunov optimization-based deep reinforcement learning (LO-DRL) policy for unknown environments. Our policies achieve near-optimal performance in CAE minimization and significantly reduce the number of uninformative transmissions.