Stochastic high dimensional bandit problems with low dimensional structure are useful in different applications such as online advertising and drug discovery. In this work, we propose a simple unified algorithm for such problems and present a general analysis framework for the regret upper bound of our algorithm. We show that under some mild unified assumptions, our algorithm can be applied to different high dimensional bandit problems. Our framework utilizes the low dimensional structure to guide the parameter estimation in the problem, therefore our algorithm achieves the best regret bounds in the LASSO bandit, better bounds in the low-rank matrix bandit and the group sparse matrix bandit, as well as a novel bound in the multi-agent LASSO bandit.
翻译:低维结构的高维土匪问题在诸如在线广告和药物发现等不同应用中非常有用。 在这项工作中,我们建议为这类问题提供一个简单的统一算法,并为我们的算法的末端提出一个总体分析框架。我们表明,根据一些温和的统一假设,我们的算法可以适用于不同的高维土匪问题。我们的框架利用低维结构来指导问题中的参数估计,因此我们的算法在LASSO土匪中达到了最好的遗憾界限,在低级矩阵土匪和群体稀少的矩阵土匪中得到了更好的界限,并在多试剂LASSO土匪中得到了新的界限。