We study best arm identification in linear bandits in the fixed-budget setting. By leveraging properties of the G-optimal design and incorporating it into the arm allocation rule, we design a parameter-free algorithm, Optimal Design-based Linear Best Arm Identification (OD-LinBAI). We provide a theoretical analysis of the failure probability of OD-LinBAI. While the performance of existing methods (e.g., BayesGap) depends on all the optimality gaps, the performance of OD-LinBAI depends only on the gaps of the top $d$ arms, where $d$ is the effective dimension of the linear bandit instance. Furthermore, we present a minimax lower bound for this problem. Together, the upper and lower bounds show that OD-LinBAI is minimax optimal up to multiplicative factors in the exponent. Finally, numerical experiments corroborate our theoretical findings.
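For readers unfamiliar with the G-optimal design mentioned above, the following is a minimal sketch of how such a design can be approximated numerically. It is not the OD-LinBAI allocation rule itself, which the abstract does not spell out. It uses a standard Frank-Wolfe iteration and relies on the Kiefer-Wolfowitz theorem, which states that the optimal value of $\max_a \|a\|^2_{A(\pi)^{-1}}$ equals $d$ when the arms span $\mathbb{R}^d$. The function names `leverage` and `g_optimal_design`, the tolerance `tol`, and the toy arm set are illustrative assumptions.

```python
import numpy as np


def leverage(arms, pi):
    """Return a_i^T A(pi)^{-1} a_i for every arm, with A(pi) = sum_i pi_i a_i a_i^T."""
    A = arms.T @ (pi[:, None] * arms)
    return np.einsum("ij,jk,ik->i", arms, np.linalg.inv(A), arms)


def g_optimal_design(arms, tol=0.01, max_iters=10_000):
    """Approximate a G-optimal design over the rows of `arms` (shape K x d).

    Assumes the arms span R^d, so that A(pi) is invertible for the
    uniform starting design. Hypothetical helper, not the paper's code.
    """
    K, d = arms.shape
    pi = np.full(K, 1.0 / K)  # start from the uniform design
    for _ in range(max_iters):
        g = leverage(arms, pi)
        i = int(np.argmax(g))
        # Kiefer-Wolfowitz: the optimal worst-case value is exactly d.
        if g[i] <= d * (1.0 + tol):
            break
        # Exact line-search step for the log-determinant objective.
        step = (g[i] / d - 1.0) / (g[i] - 1.0)
        pi = (1.0 - step) * pi
        pi[i] += step
    return pi


# Toy instance (assumed for illustration): two basis arms and one interior arm.
arms = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
pi = g_optimal_design(arms)
worst_case = leverage(arms, pi).max()
```

On this toy instance the design shifts weight away from the interior arm and toward the two basis arms, and `worst_case` approaches the dimension $d = 2$. This plain Frank-Wolfe variant converges slowly as `tol` shrinks, so a tighter tolerance may need more iterations.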