This paper focuses on the information freshness of finite-state Markov sources, using the uncertainty of information (UoI) as the performance metric. Measured by Shannon's entropy, UoI captures not only the transition dynamics of the Markov source but also the different evolutions of information quality caused by different values of the last observation. We consider an information update system in which M finite-state Markov sources transmit information to a remote monitor via m communication channels. Our goal is to find the optimal scheduling policy that minimizes the sum-UoI of the Markov sources. The problem is formulated as a restless multi-armed bandit (RMAB). We relax the RMAB and decouple the relaxed problem into M single-bandit problems. Analyzing the single-bandit problem yields useful properties with which the relaxed problem reduces to maximizing a concave, piecewise-linear function, allowing us to develop a gradient method that solves the relaxed problem and obtains its optimal policy. By rounding the optimal policy for the relaxed problem, we obtain an index policy for the original RMAB problem. Notably, the proposed index policy is universal in the sense that it applies to general RMABs with bounded cost functions.
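To make the setting concrete, the following is a minimal sketch of UoI-based scheduling. It is not the paper's index policy; it uses a simple myopic rule (update the m sources with the largest current UoI) as an illustrative baseline. All parameters (the two-state transition matrices, M = 4, m = 2, the horizon) are hypothetical. UoI is computed as the Shannon entropy of the monitor's belief about each source's state, which resets to zero on a fresh observation and grows as the belief drifts under the source's Markov dynamics.

```python
import numpy as np

def uoi(belief):
    """Uncertainty of information: Shannon entropy of the belief (bits)."""
    p = belief[belief > 1e-12]
    return float(-np.sum(p * np.log2(p)))

def simulate(Ps, m, horizon, rng):
    """Myopic-UoI scheduling: each slot, update the m sources whose
    current belief entropy (UoI) is largest. Returns time-averaged sum-UoI."""
    M = len(Ps)
    n = Ps[0].shape[0]
    beliefs = [np.full(n, 1.0 / n) for _ in range(M)]  # uninformed priors
    total = 0.0
    for _ in range(horizon):
        # every source's state drifts one step under its own chain
        beliefs = [b @ P for b, P in zip(beliefs, Ps)]
        scores = np.array([uoi(b) for b in beliefs])
        for i in np.argsort(scores)[-m:]:       # schedule m highest-UoI sources
            s = rng.choice(n, p=beliefs[i])     # draw an observed state
            beliefs[i] = np.eye(n)[s]           # fresh observation: UoI = 0
        total += sum(uoi(b) for b in beliefs)
    return total / horizon

# Hypothetical example: M = 4 two-state sources, m = 2 channels.
rng = np.random.default_rng(0)
Ps = [np.array([[1 - a, a], [b, 1 - b]])
      for a, b in [(0.1, 0.2), (0.3, 0.3), (0.05, 0.4), (0.25, 0.15)]]
avg_sum_uoi = simulate(Ps, m=2, horizon=500, rng=rng)
```

Because each source here has two states, each per-source UoI lies in [0, 1] bit, so the time-averaged sum-UoI is bounded by M = 4; the paper's index policy is designed to push this sum toward its minimum.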