Stochastic high dimensional bandit problems with low dimensional structures are useful in different applications such as online advertising and drug discovery. In this work, we propose a simple unified algorithm for such problems and present a general analysis framework for the regret upper bound of our algorithm. We show that under some mild unified assumptions, our algorithm can be applied to different high dimensional bandit problems. Our framework utilizes the low dimensional structure to guide the parameter estimation in the problem, therefore our algorithm achieves the best regret bounds in the LASSO bandit, as well as novel bounds in the low-rank matrix bandit, the group sparse matrix bandit, and in a new problem: the multi-agent LASSO bandit.
翻译:在网上广告和毒品发现等不同应用中,低维结构的高维土匪问题对诸如网上广告和药物发现等不同应用是有用的。在这项工作中,我们建议为这类问题提供一个简单的统一算法,并为我们的算法的末端遗憾地提出一个总体分析框架。我们表明,根据一些温和的统一假设,我们的算法可以适用于不同的高维土匪问题。我们的框架利用低维结构来指导问题中的参数估计,因此我们的算法在LASSO土匪中取得了最好的遗憾界限,以及在低级矩阵土匪、群体稀少的矩阵土匪和一个新问题中,即多剂LASSO土匪中取得了最新的界限。