We consider the problem of optimizing the decisions of a preemptive transmitter to minimize the Age of Incorrect Information (AoII) when the channel has a random delay. In the system, a transmitter observes a dynamic source and makes decisions based on the system status. When the channel is busy, the transmitter can choose whether to preempt the update in transit and transmit a new one. When the channel is idle, the transmitter can choose whether or not to transmit a new update. We assume that the channel delay is random and is independent and identically distributed across updates. At the other end of the channel is a receiver that estimates the state of the dynamic source based on the updates it receives. We adopt AoII to measure the performance of the system. Therefore, this paper aims to optimize the transmitter's action in each time slot to minimize AoII. We first formulate the optimization problem as a Markov decision process and give the corresponding value iteration algorithm to obtain the optimal policy. However, the value iteration algorithm is computationally demanding and requires some approximations to be implemented. Hence, we theoretically analyze some canonical delay distributions and obtain the corresponding optimal policies by leveraging the policy improvement theorem.
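For readers unfamiliar with value iteration, the following is a minimal generic sketch on a toy finite Markov decision process. It is not the model or the algorithm of this paper: the truncated penalty levels, the two actions (idle and transmit), the parameters `p_grow`, `p_deliver`, and `tx_cost`, the discount factor `gamma`, and the function names `toy_model` and `value_iteration` are all assumptions introduced purely for illustration.

```python
import numpy as np


def toy_model(n_levels=10, p_grow=0.7, p_deliver=0.6, tx_cost=1.5):
    """Build a hypothetical two-action MDP (illustration only).

    State s in {0, ..., n_levels} is a truncated penalty level.
    Action 0 (idle): the penalty grows by one with probability p_grow.
    Action 1 (transmit): the penalty resets to 0 with probability
    p_deliver; otherwise it evolves as under idle. Transmitting adds
    an assumed extra cost tx_cost.
    """
    n_states = n_levels + 1
    P = np.zeros((2, n_states, n_states))  # P[a, s, s'] transition probabilities
    c = np.zeros((2, n_states))            # c[a, s] per-slot cost
    for s in range(n_states):
        up = min(s + 1, n_levels)          # truncate the state space
        P[0, s, up] += p_grow
        P[0, s, s] += 1.0 - p_grow
        c[0, s] = s
        P[1, s, 0] += p_deliver
        P[1, s, up] += (1.0 - p_deliver) * p_grow
        P[1, s, s] += (1.0 - p_deliver) * (1.0 - p_grow)
        c[1, s] = s + tx_cost
    return P, c


def value_iteration(P, c, gamma=0.95, tol=1e-8, max_iter=10_000):
    """Discounted-cost value iteration; returns (V, greedy policy)."""
    V = np.zeros(P.shape[1])
    for _ in range(max_iter):
        Q = c + gamma * (P @ V)            # Q[a, s], shape (actions, states)
        V_new = Q.min(axis=0)              # Bellman update (cost minimization)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    Q = c + gamma * (P @ V)
    return V, Q.argmin(axis=0)
```

Calling `V, policy = value_iteration(*toy_model())` returns the approximate value of each penalty level and the greedy action per level. Even in this toy, each sweep costs on the order of (actions × states²) operations, and the state space has to be truncated to make the iteration finite, which hints at why approximations become necessary in richer models.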