We consider combinatorial semi-bandits with uncorrelated Gaussian rewards. We propose what is, to the best of our knowledge, the first method that computes the solution of the Graves-Lai optimization problem in polynomial time for many combinatorial structures of interest. This immediately yields the first known approach for implementing asymptotically optimal algorithms in polynomial time for combinatorial semi-bandits.
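For orientation, the Graves-Lai problem in this setting can be sketched in its generic form as follows. The notation is an assumption introduced here and does not come from the abstract: $\mathcal{X} \subset \{0,1\}^d$ is the action set, $\theta$ the unknown mean vector, $\sigma_i^2$ the variance of item $i$, $x^\star$ the optimal action (assumed unique, with rewards maximized), $\Delta_x$ the gap of action $x$, and $\eta_x$ the exploration rate assigned to $x$.

```latex
\begin{align*}
\underset{\eta \ge 0}{\text{minimize}} \quad
  & \sum_{x \in \mathcal{X}} \eta_x \, \Delta_x,
  \qquad \Delta_x = \theta^\top x^\star - \theta^\top x \\
\text{subject to} \quad
  & \sum_{i=1}^{d} \Bigl( \sum_{x \in \mathcal{X}} \eta_x x_i \Bigr)
    \frac{(\theta_i - \lambda_i)^2}{2 \sigma_i^2} \ge 1
  \quad \text{for all } \lambda \in \Lambda(\theta), \\
\Lambda(\theta) =
  & \ \bigl\{ \lambda \in \mathbb{R}^d :
      \lambda_i = \theta_i \text{ whenever } x^\star_i = 1, \
      \max_{x \in \mathcal{X}} \lambda^\top x > \lambda^\top x^\star \bigr\}
\end{align*}
```

Under these assumptions, the optimal value $C(\theta)$ gives the asymptotic regret lower bound $C(\theta) \log T$. The summand $(\theta_i - \lambda_i)^2 / (2\sigma_i^2)$ is the Kullback-Leibler divergence between two Gaussians with the same variance $\sigma_i^2$. Since $|\mathcal{X}|$ is typically exponential in $d$, the program has exponentially many variables, which is why solving it in polynomial time is not straightforward.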