We consider the problem of Combinatorial Pure Exploration (CPE), which deals with finding a combinatorial set or arms with a high reward, when the rewards of individual arms are unknown in advance and must be estimated using arm pulls. Previous algorithms for this problem, while obtaining sample complexity reductions in many cases, are highly computationally intensive, thus making them impractical even for mildly large problems. In this work, we propose a new CPE algorithm in the PAC setting, which is computationally light weight, and so can easily be applied to problems with tens of thousands of arms. This is achieved since the proposed algorithm requires a very small number of combinatorial oracle calls. The algorithm is based on successive acceptance of arms, along with elimination which is based on the combinatorial structure of the problem. We provide sample complexity guarantees for our algorithm, and demonstrate in experiments its usefulness on large problems, whereas previous algorithms are impractical to run on problems of even a few dozen arms. The code for the algorithms and experiments is provided at https://github.com/noabdavid/csale.
翻译:我们考虑的是混合纯勘探(CPE)问题,它涉及的是寻找一组武器或具有高报酬的武器,而个别武器的奖励事先不为人知,必须用手臂拉来估计。以前对这一问题的算法虽然在许多情况下获得样品复杂性的减少,但具有很高的计算密集性,因此即使轻微的较大问题也使它们不切实际。在这项工作中,我们提议在PAC设置中采用一种新的CPE算法,即计算性轻重,因此可以很容易地适用于数万件武器的问题。这是由于提议的算法需要非常少的组合手动电话而实现的。这种算法的基础是连续接受武器,同时根据问题的组合结构来消除武器。我们为我们的算法提供抽样复杂性保证,并在实验中证明它对大问题的用处,而以前的算法甚至对几十件武器的问题都不切实际操作。计算法和实验的代码在https://github.com/noabdavid/cselle上提供。