Whittle index is a generalization of Gittins index that provides very efficient allocation rules for restless multi-armed bandits. In this work, we develop an algorithm to test the indexability and compute the Whittle indices of any finite-state restless bandit arm. This algorithm works in the discounted and non-discounted cases, and can compute Gittins index. Our algorithm builds on three tools: (1) a careful characterization of Whittle index that allows one to compute recursively the kth smallest index from the (k -- 1)th smallest, and to test indexability, (2) the use of the Sherman-Morrison formula to make this recursive computation efficient, and (3) a sporadic use of the fastest matrix inversion and multiplication methods to obtain a subcubic complexity. We show that an efficient use of the Sherman-Morrison formula leads to an algorithm that computes Whittle index in (2/3)n^3 + o(n^3 ) arithmetic operations, where n is the number of states of the arm. The careful use of fast matrix multiplication leads to the first subcubic algorithm to compute Whittle or Gittins index: By using the current fastest matrix multiplication, the theoretical complexity of our algorithm is O(n^2.5286 ). We also develop an efficient implementation of our algorithm that can compute indices of Markov chains with several thousands of states in less than a few seconds.
翻译:Whittle指数是Gittins指数的一种推广,为不安静的多臂赌博机提供了非常高效的分配规则。在这项工作中,我们开发了一种算法,用于测试任何有限状态的不安静的赌博机臂的可索引性并计算其Whittle指数。该算法适用于折扣和非折扣情况,并且可以计算Gittins指数。我们的算法建立在三个工具上:(1)仔细描述Whittle指数,允许您从第(k-1)个最小下标递归地计算第k个最小下标,并测试可索引性,(2)使用Sherman-Morrison公式使这种递归计算高效,并且(3)偶尔使用最快的矩阵求逆和乘法方法以获得次立方复杂度。我们表明,有效地使用Sherman-Morrison公式会导致算法在(2/3)n^3 + o(n^3)算术操作中计算Whittle指数,其中n是该臂的状态数。对快速矩阵乘法的谨慎使用导致了第一个计算Whittle或Gittins指数的复杂度小于次立方级别的算法:通过使用当前最快的矩阵乘法,我们的算法的理论复杂度为O(n^2.5286)。我们还开发了一种高效的实现,可以在不到几秒钟的时间内计算具有数千个状态的马尔可夫链的指数。