Stochastic sparse linear bandits offer a practical model for high-dimensional online decision-making problems and have a rich information-regret structure. In this work we explore the use of information-directed sampling (IDS), which naturally balances the information-regret trade-off. We develop a class of information-theoretic Bayesian regret bounds that nearly match existing lower bounds on a variety of problem instances, demonstrating the adaptivity of IDS. To efficiently implement sparse IDS, we propose an empirical Bayesian approach for sparse posterior sampling using a spike-and-slab Gaussian-Laplace prior. Numerical results demonstrate significant regret reductions by sparse IDS relative to several baselines.
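To make the information-regret trade-off concrete, the following is a minimal sketch of one IDS action-selection step, not the procedure developed in this work. It assumes a finite action set, a deterministic variant of IDS that picks a single action rather than a distribution over actions, and a variance-based surrogate for the information gain about the optimal action. The function name `ids_action` and its inputs are hypothetical; in particular, the posterior samples are taken as given, so the spike-and-slab sampler is not implemented here.

```python
import numpy as np


def ids_action(actions, theta_samples, eps=1e-12):
    """Pick the action minimizing (expected regret)^2 / (information gain surrogate).

    actions:       (K, d) array of candidate feature vectors (hypothetical input)
    theta_samples: (M, d) array of posterior samples of the unknown parameter
    """
    # Reward of every action under every posterior sample, shape (M, K).
    rewards = theta_samples @ actions.T
    mean_reward = rewards.mean(axis=0)

    # Expected regret of each action: E[max_a r(a)] - E[r(a)].
    regret = rewards.max(axis=1).mean() - mean_reward

    # Variance-based surrogate (an assumption of this sketch): the variance,
    # over the identity of the optimal action a*, of E[r(a) | a*].
    best = rewards.argmax(axis=1)
    info = np.zeros(actions.shape[0])
    for k in np.unique(best):
        mask = best == k
        cond_mean = rewards[mask].mean(axis=0)
        info += mask.mean() * (cond_mean - mean_reward) ** 2

    # eps guards against division by zero when an action is uninformative.
    ratio = regret ** 2 / (info + eps)
    return int(ratio.argmin()), regret, info
```

In a bandit loop, `theta_samples` would be redrawn from the current posterior each round, and the selected action's observed reward would be used to update that posterior before the next call.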