Robust Federated Best-Arm Identification in Multi-Armed Bandits

We study a federated variant of the best-arm identification problem in stochastic multi-armed bandits: a set of clients, each of whom can sample only a subset of the arms, collaborate via a server to identify the best arm (i.e., the arm with the highest mean reward) with prescribed confidence. For this problem, we propose Fed-SEL, a simple communication-efficient algorithm that builds on successive elimination techniques and involves local sampling steps at the clients. To study the performance of Fed-SEL, we introduce a notion of arm-heterogeneity that captures the level of dissimilarity between the distributions of arms corresponding to different clients. Interestingly, our analysis reveals that arm-heterogeneity reduces both the sample complexity and the communication complexity of Fed-SEL. As a special case of our analysis, we show that for certain heterogeneous problem instances, Fed-SEL outputs the best arm after just one round of communication. Our findings have the following key implication: unlike federated supervised learning, where recent work has shown that statistical heterogeneity can lead to poor performance, one can provably reap the benefits of both local computation and heterogeneity for federated best-arm identification. As our final contribution, we develop variants of Fed-SEL, for both federated and peer-to-peer settings, that are robust to the presence of Byzantine clients and hence suitable for deployment in harsh, adversarial environments.
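To make the setting concrete, the following is a minimal Python sketch of the general idea, not the Fed-SEL algorithm itself. It assumes unit-variance Gaussian rewards, disjoint arm subsets per client, a standard textbook confidence radius, and a single communication round in which the server simply takes the largest reported estimate. The names `successive_elimination` and `one_round_best_arm` are hypothetical, and the server step carries no confidence guarantee in general; it is only plausible on well-separated (heterogeneous) instances of the kind mentioned in the abstract.

```python
import math
import random


def successive_elimination(means, delta, rng, max_rounds=100_000):
    """Local successive elimination over one client's arms.

    Assumption: rewards are Gaussian with unit variance and the given means.
    Returns (index of surviving arm, its empirical mean, total samples used).
    """
    k = len(means)
    active = list(range(k))
    sums = [0.0] * k
    samples = 0
    t = 0
    while len(active) > 1 and t < max_rounds:
        t += 1
        # Sample every active arm once, so each has exactly t samples.
        for a in active:
            sums[a] += rng.gauss(means[a], 1.0)
            samples += 1
        # Textbook anytime confidence radius (an assumption, not the paper's).
        radius = math.sqrt(2.0 * math.log(4.0 * k * t * t / delta) / t)
        best = max(sums[a] / t for a in active)
        # Keep arms whose interval still overlaps that of the empirical best.
        active = [a for a in active if sums[a] / t + 2.0 * radius >= best]
    t = max(t, 1)  # guards the single-arm case, where no sampling occurred
    winner = max(active, key=lambda a: sums[a])
    return winner, sums[winner] / t, samples


def one_round_best_arm(client_arms, delta, seed=0):
    """Each client eliminates locally, then reports once to the server.

    client_arms maps a client id to a dict {arm name: true mean}.
    The server picks the arm with the highest reported empirical mean.
    """
    rng = random.Random(seed)
    reports = []
    for client_id, arms in client_arms.items():
        names = list(arms)
        local_means = [arms[name] for name in names]
        # Split the failure budget evenly across clients (union bound).
        idx, estimate, _ = successive_elimination(
            local_means, delta / len(client_arms), rng
        )
        reports.append((estimate, names[idx], client_id))
    estimate, arm, client_id = max(reports, key=lambda r: r[0])
    return arm, client_id, estimate


if __name__ == "__main__":
    clients = {
        "client_1": {"a": 0.1, "b": 0.5},
        "client_2": {"c": 2.0, "d": 0.2},
    }
    print(one_round_best_arm(clients, delta=0.05))
```

In this toy instance the globally best arm is far from every arm held by the other client, so a single report per client is enough for the server to separate the local winners; with closer means, further rounds of elimination between the server and the clients would be needed.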
