Federated learning is an emerging technique for training deep models over decentralized clients without exposing private data, but it suffers from label distribution skew, which usually results in slow convergence and degraded model performance. This challenge becomes more serious when the participating clients operate in unstable circumstances and drop out frequently. Previous work and our empirical observations demonstrate that the classifier head is particularly sensitive to label skew, and that the unstable performance of FedAvg mainly stems from the imbalanced training samples across classes. A biased classifier head also impairs the learning of feature representations. Therefore, maintaining a balanced classifier head is of significant importance for building a better global model. To tackle this issue, we propose a simple yet effective framework that introduces a prior-calibrated softmax function for computing the cross-entropy loss and a prototype-based feature augmentation scheme to re-balance the local training, both of which are lightweight enough for edge devices and facilitate global model aggregation. Extensive experiments on the FashionMNIST and CIFAR-10 datasets demonstrate that our method improves model performance over existing baselines in the presence of non-IID data and client dropout.
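To make the first ingredient concrete, below is a minimal sketch of a prior-calibrated softmax cross-entropy in the style of logit adjustment: each logit is shifted by the log of the client's local class prior before the softmax, so that locally over-represented classes do not dominate the loss. The function name, the temperature parameter `tau`, and the use of local sample counts as the prior are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def prior_calibrated_cross_entropy(logits, targets, class_counts, tau=1.0):
    """Illustrative prior-calibrated softmax cross-entropy (assumed form).

    logits:       (batch, num_classes) raw model outputs
    targets:      (batch,) ground-truth class indices
    class_counts: (num_classes,) local sample counts on this client
    tau:          calibration strength (hypothetical hyperparameter)
    """
    # Estimate the client's local label prior from its sample counts.
    prior = class_counts.float() / class_counts.sum()
    # Shift logits by tau * log(prior); majority classes get a larger offset,
    # which down-weights them after the softmax normalization.
    adjusted_logits = logits + tau * torch.log(prior + 1e-12)
    return F.cross_entropy(adjusted_logits, targets)

# Example: a client whose local data is heavily skewed across 10 classes.
logits = torch.randn(32, 10)
targets = torch.randint(0, 10, (32,))
counts = torch.tensor([500, 5, 5, 200, 1, 1, 50, 10, 3, 2])
loss = prior_calibrated_cross_entropy(logits, targets, counts)
```

Because the correction only rescales logits with locally available statistics, it adds negligible compute and memory cost on edge devices, which is consistent with the lightweight design goal stated above.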