We investigate a transmitter-receiver pair in a slotted-time system. The transmitter observes a dynamic source and sends updates to a remote receiver through a communication channel. We assume that the channel is error-free but subject to a random delay. To keep the analysis tractable, we consider two practical cases. In the first case, each update is guaranteed to be delivered within a fixed number of time slots. In the second case, once the transmission time exceeds a predetermined value, the update is immediately discarded, leaving the channel free for a new transmission. The receiver maintains an estimate of the current state of the dynamic source based on the received updates. In this paper, we adopt the Age of Incorrect Information (AoII) as the performance metric and investigate the problem of optimizing the transmitter's action in each time slot to minimize AoII. We first formulate the optimization problem as a Markov decision process and investigate the performance of the threshold policy, under which the transmitter transmits updates only when the AoII exceeds the threshold $\tau$. By exploiting the characteristics of the system evolution, we compute the exact expected AoII achieved by the threshold policy using a Markov chain. Then, we prove that the optimal policy exists and provide a computable relative value iteration algorithm to estimate it. Next, by leveraging the policy improvement theorem, we prove that, under an easy-to-verify condition, the optimal policy is the threshold policy with $\tau=1$. Finally, numerical results are presented to highlight the performance of the optimal policy.
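To make the threshold policy concrete, the following is a minimal simulation sketch of the first case (delivery within a bounded number of slots). It is not the paper's analytical method, which uses a Markov chain, and every modelling detail is an assumption made for illustration: a binary symmetric source with flip probability `p_flip`, a delay distribution `delay_pmf` over 1 to 3 slots, an update that carries the source value sampled at the start of transmission, and a reading of "exceeds the threshold" as AoII $\geq \tau$. The function name `simulate_threshold_policy` is hypothetical.

```python
import random


def simulate_threshold_policy(tau, p_flip=0.2, delay_pmf=(0.6, 0.3, 0.1),
                              slots=100_000, seed=0):
    """Monte Carlo estimate of the time-average AoII under a threshold policy.

    Assumptions (illustrative, not taken from the paper):
      - binary source that flips with probability p_flip in each slot;
      - delay of d slots with probability delay_pmf[d - 1], so delivery is
        guaranteed within len(delay_pmf) slots;
      - the transmitter sends only when the channel is idle and AoII >= tau.
    """
    rng = random.Random(seed)
    source = 0        # current state of the source
    estimate = 0      # receiver's estimate of the source
    aoii = 0          # Age of Incorrect Information
    in_flight = None  # [remaining_slots, carried_value] or None if idle
    total = 0

    for _ in range(slots):
        # Transmitter decision at the start of the slot.
        if in_flight is None and aoii >= tau:
            delay = rng.choices(range(1, len(delay_pmf) + 1),
                                weights=delay_pmf)[0]
            in_flight = [delay, source]

        # The source evolves during the slot.
        if rng.random() < p_flip:
            source ^= 1

        # The channel advances; a finished update overwrites the estimate.
        if in_flight is not None:
            in_flight[0] -= 1
            if in_flight[0] == 0:
                estimate = in_flight[1]
                in_flight = None

        # AoII grows while the estimate is wrong and resets once it is right.
        aoii = aoii + 1 if estimate != source else 0
        total += aoii

    return total / slots


if __name__ == "__main__":
    for tau in (1, 2, 5):
        print(tau, simulate_threshold_policy(tau))
```

Sweeping `tau` in this way gives an empirical point of comparison for the expected AoII of each threshold; the second case (discarding an update after a deadline) would need an extra rule that frees the channel without changing the estimate.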