The Whittle index is a generalization of the Gittins index that provides very efficient allocation rules for restless multi-armed bandits. In this paper, we develop an algorithm to test the indexability and compute the Whittle indices of any finite-state Markovian bandit problem. This algorithm works in both the discounted and non-discounted cases. As a byproduct, it can also be used to compute the Gittins index. Our algorithm builds on three tools: (1) a careful characterization of the Whittle index that allows one to recursively compute the $k$th smallest index from the $(k-1)$th smallest, and to test indexability; (2) the use of the Sherman-Morrison formula to make this recursive computation efficient; and (3) a sporadic use of fast matrix inversion and multiplication to obtain a subcubic complexity. We show that an efficient use of the Sherman-Morrison formula leads to an algorithm that computes the Whittle index in $(2/3)n^3 + o(n^3)$ arithmetic operations, where $n$ is the number of states of the arm. The careful use of fast matrix multiplication leads to the first subcubic algorithm to compute the Whittle (or Gittins) index. Using the currently fastest known matrix multiplication, our algorithm runs in $O(n^{2.5286})$. We also conduct a series of experiments that demonstrate that our algorithm is very efficient in practice and can compute the indices of Markov chains with several thousand states in a few seconds.
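To make the role of the Sherman-Morrison formula concrete, the following is a minimal sketch of the general identity $(A + uv^\top)^{-1} = A^{-1} - \frac{A^{-1}u\,v^\top A^{-1}}{1 + v^\top A^{-1}u}$, which updates a known inverse after a rank-one change in $O(n^2)$ operations instead of re-inverting in $O(n^3)$. It is not the algorithm of the paper; the function name `sherman_morrison_update` and the example matrices are hypothetical and chosen only for illustration.

```python
def sherman_morrison_update(a_inv, u, v):
    """Return (A + u v^T)^{-1} given a_inv = A^{-1}.

    a_inv is a list of n rows (each a list of n floats); u and v are
    lists of n floats. Hypothetical helper, for illustration only.
    """
    n = len(a_inv)
    # Column vector A^{-1} u.
    a_inv_u = [sum(a_inv[i][k] * u[k] for k in range(n)) for i in range(n)]
    # Row vector v^T A^{-1}.
    vt_a_inv = [sum(v[k] * a_inv[k][j] for k in range(n)) for j in range(n)]
    # Scalar 1 + v^T A^{-1} u; if it is zero, A + u v^T is singular.
    denom = 1.0 + sum(v[k] * a_inv_u[k] for k in range(n))
    if abs(denom) < 1e-12:
        raise ValueError("rank-one update makes the matrix singular")
    # Subtract the rank-one correction from A^{-1}.
    return [
        [a_inv[i][j] - a_inv_u[i] * vt_a_inv[j] / denom for j in range(n)]
        for i in range(n)
    ]


# Example: A = [[2, 0], [0, 4]], so A^{-1} = [[0.5, 0], [0, 0.25]].
# With u = e_1 and v = e_2, A + u v^T = [[2, 1], [0, 4]].
updated_inverse = sherman_morrison_update(
    [[0.5, 0.0], [0.0, 0.25]], [1.0, 0.0], [0.0, 1.0]
)
```

Each call costs two matrix-vector products and one outer-product correction, all quadratic in $n$, which is why applying such an update once per recursive step can keep the overall cost cubic rather than quartic.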