Age of incorrect information (AoII) is a recently proposed freshness and mismatch metric that penalizes an incorrect estimation along with its duration. Therefore, keeping track of AoII requires the knowledge of both the source and estimation processes. In this paper, we consider a time-slotted pull-based remote estimation system under a sampling rate constraint where the information source is a general discrete-time Markov chain (DTMC) process. Moreover, packet transmission times from the source to the monitor are non-zero which disallows the monitor to have perfect information on the actual AoII process at any time. Hence, for this pull-based system, we propose the monitor to maintain a sufficient statistic called {\em belief} which stands for the joint distribution of the age and source processes to be obtained from the history of all observations. Using belief, we first propose a maximum a posteriori (MAP) estimator to be used at the monitor as opposed to existing martingale estimators in the literature. Second, we obtain the optimality equations from the belief-MDP (Markov decision process) formulation. Finally, we propose two belief-dependent policies one of which is based on deep reinforcement learning, and the other one is a threshold-based policy based on the instantaneous expected AoII.
翻译:暂无翻译