Federated learning encompasses distributed learning strategies that are coordinated by a central unit. Since it relies on a selected subset of agents at each iteration, and since each agent, in turn, draws on its own local data, it is natural to study optimal sampling policies for selecting agents and their data in federated learning implementations. Usually, only uniform sampling schemes are used. In this work, we instead examine the effect of importance sampling and devise schemes for sampling agents and data non-uniformly, guided by a performance measure. We find that in schemes involving sampling without replacement, the performance of the resulting architecture is controlled by two factors, one related to data variability at each agent and the other to model variability across agents. We illustrate the theoretical findings with experiments on simulated and real data and show the improvement in performance that results from the proposed strategies.
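To make the idea of non-uniform agent sampling concrete, the following is a minimal sketch and not the scheme proposed in this work. It assumes sampling with replacement (simpler than the without-replacement setting analyzed here) and uses the norm of each agent's update as a stand-in importance score. The names `importance_probs`, `sample_agents`, and `aggregate` are hypothetical and introduced only for illustration.

```python
import random


def importance_probs(scores, eps=1e-12):
    """Normalize non-negative scores into sampling probabilities.

    The choice of score is an assumption of this sketch; eps keeps every
    probability strictly positive so the reweighting below is well defined.
    """
    shifted = [s + eps for s in scores]
    total = sum(shifted)
    return [s / total for s in shifted]


def sample_agents(probs, num_selected, rng):
    """Draw agent indices with replacement and return (indices, weights)."""
    num_agents = len(probs)
    indices = rng.choices(range(num_agents), weights=probs, k=num_selected)
    # Weight 1 / (K * p_k) compensates for the non-uniform draw, so the
    # weighted sample average is an unbiased estimate of the plain average
    # over all K agents.
    weights = [1.0 / (num_agents * probs[k]) for k in indices]
    return indices, weights


def aggregate(updates, indices, weights):
    """Average the reweighted updates of the sampled agents."""
    dim = len(updates[0])
    total = [0.0] * dim
    for k, w in zip(indices, weights):
        for j in range(dim):
            total[j] += w * updates[k][j]
    return [t / len(indices) for t in total]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical local updates from 5 agents, each a 3-dimensional vector.
    updates = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(5)]
    scores = [sum(x * x for x in u) ** 0.5 for u in updates]
    probs = importance_probs(scores)
    indices, weights = sample_agents(probs, num_selected=2, rng=rng)
    print(aggregate(updates, indices, weights))
```

The reweighting step is what separates importance sampling from simply favoring some agents: without the `1 / (K * p_k)` factor, the aggregate would be biased toward the agents that are sampled more often.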