Shapley Value is commonly adopted to measure and incentivize client participation in federated learning. In this paper, we show -- theoretically and through simulations -- that Shapley Value underestimates the contribution of a common type of client: the Maverick. Mavericks are clients that differ both in data distribution and data quantity and can be the sole owners of certain types of data. Selecting the right clients at the right moment is important for federated learning to reduce convergence times and improve accuracy. We propose FedEMD, an adaptive client selection strategy based on the Wasserstein distance between the local and global data distributions. As FedEMD adapts the selection probability such that Mavericks are preferably selected when the model benefits from improvement on rare classes, it consistently ensures fast convergence in the presence of different types of Mavericks. Compared to existing strategies, including Shapley Value-based ones, FedEMD improves the convergence of neural network classifiers by at least 26.9% under FedAvg aggregation.
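To make the selection idea concrete, the following is a minimal sketch of distance-based adaptive client selection. It computes the Wasserstein distance between each client's local label distribution and the global one, then maps those distances to selection probabilities with an illustrative schedule that shifts weight toward high-distance clients (such as Mavericks) as training progresses. The helper names, the exponential weighting, and the `beta` parameter are assumptions for illustration, not the exact FedEMD rule from the paper.

```python
import numpy as np
from scipy.stats import wasserstein_distance


def label_distribution(labels, num_classes):
    """Empirical class distribution of a client's local dataset."""
    counts = np.bincount(labels, minlength=num_classes).astype(float)
    return counts / counts.sum()


def emd_to_global(local_dist, global_dist):
    """1-D Wasserstein distance between local and global label distributions."""
    classes = np.arange(len(global_dist))
    return wasserstein_distance(classes, classes,
                                u_weights=local_dist, v_weights=global_dist)


def selection_probabilities(emds, round_t, beta=0.05):
    """Hypothetical adaptive weighting: early rounds favour low-distance
    clients (stable updates); later rounds shift probability toward
    high-distance clients that own rare classes. The schedule below is
    an assumption chosen only to illustrate the adaptation."""
    emds = np.asarray(emds, dtype=float)
    scores = np.exp((beta * round_t - 1.0) * emds)
    return scores / scores.sum()


# Example: three ordinary clients and one Maverick holding a rare class.
num_classes = 4
global_dist = np.full(num_classes, 1.0 / num_classes)
client_labels = [
    np.random.choice(3, size=200),  # classes 0-2 only
    np.random.choice(3, size=200),
    np.random.choice(3, size=200),
    np.full(400, 3),                # Maverick: sole owner of class 3
]
emds = [emd_to_global(label_distribution(y, num_classes), global_dist)
        for y in client_labels]
for t in (1, 50):
    print(f"round {t}:", np.round(selection_probabilities(emds, t), 3))
```

Running this prints how the Maverick's selection probability starts low and grows with the round index, which mirrors the behaviour the abstract attributes to FedEMD: preferring Mavericks once the model stands to gain from improvement on rare classes.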