Federated learning is a distributed machine learning paradigm that trains a global prediction model from a number of local models at clients while preserving the privacy of local data. Class imbalance is believed to be one of the factors that degrade the global model's performance. However, there has been very little research on whether and how class imbalance affects global performance. Class imbalance in federated learning is much more complex than in traditional, non-distributed machine learning, because the class imbalance situation can differ across local clients; class imbalance therefore needs to be redefined for distributed learning environments. In this paper, we first propose two new metrics to characterize class imbalance: the global class imbalance degree (MID) and the local difference of class imbalance among clients (WCS). We then conduct extensive experiments, based on these definitions, to analyze the impact of class imbalance on global performance in various scenarios. Our results show that a higher MID and a larger WCS both degrade the performance of the global model more severely. Moreover, a larger WCS is shown to slow down the convergence of the global model by misdirecting the optimization.
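To make the two notions concrete, the sketch below illustrates one plausible way such metrics could be computed from per-client label sets. The exact definitions of MID and WCS belong to the paper itself and are not given in this abstract; here MID is assumed, for illustration only, to measure how far the pooled (global) class distribution is from uniform, and WCS to measure how similar each client's local class distribution is to the global one (so a lower value indicates a larger cross-client difference). Both formulas and the function names are hypothetical.

```python
import numpy as np

def class_distribution(labels, num_classes):
    """Empirical class distribution of a 1-D array of integer labels."""
    counts = np.bincount(labels, minlength=num_classes)
    return counts / counts.sum()

def global_imbalance(client_labels, num_classes):
    """Hypothetical MID-style metric: distance of the pooled class
    distribution from the uniform distribution (0 = perfectly balanced)."""
    pooled = np.concatenate(client_labels)
    p = class_distribution(pooled, num_classes)
    uniform = np.full(num_classes, 1.0 / num_classes)
    return float(np.linalg.norm(p - uniform))

def cross_client_similarity(client_labels, num_classes):
    """Hypothetical WCS-style metric: mean cosine similarity between each
    client's class distribution and the global one (1 = identical clients;
    lower values mean larger local differences in class imbalance)."""
    pooled = np.concatenate(client_labels)
    g = class_distribution(pooled, num_classes)
    sims = []
    for labels in client_labels:
        c = class_distribution(labels, num_classes)
        sims.append(np.dot(c, g) / (np.linalg.norm(c) * np.linalg.norm(g)))
    return float(np.mean(sims))
```

With identical, perfectly balanced clients both quantities take their extreme "benign" values (imbalance near 0, similarity near 1); skewing either the pooled distribution or individual clients moves them away from those extremes.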