As privacy protection draws growing attention, Federated Learning (FL) has emerged as a promising distributed machine learning paradigm. However, because data on real-world devices is unevenly distributed, federated learning achieves lower classification accuracy than traditional machine learning in Non-IID scenarios. Although many optimization algorithms exist, local model aggregation at the parameter server remains relatively traditional. In this paper, we propose a new algorithm, FedPDC, which exploits the shared datasets available in some industries to optimize both the aggregation of local models and the loss function used in local training. In many benchmark experiments, FedPDC effectively improves the accuracy of the global model under extremely imbalanced data distributions while preserving the privacy of client data. Moreover, the accuracy gains of FedPDC incur no additional communication cost.
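For context, the "traditional" parameter-server aggregation the abstract refers to is the standard FedAvg weighted average of client models. The following is a minimal sketch of that baseline (not of FedPDC itself); the function name and the per-layer array layout are our own illustrative assumptions:

```python
import numpy as np

def fedavg_aggregate(client_weights, client_sizes):
    """Standard FedAvg aggregation (the baseline, not FedPDC):
    a weighted average of local models, where each client's weight
    is proportional to the size of its local dataset."""
    total = sum(client_sizes)
    aggregated = []
    # client_weights[k] is assumed to be a list of per-layer ndarrays
    # for client k; all clients share the same model architecture.
    for layer_idx in range(len(client_weights[0])):
        layer = sum(
            (n / total) * w[layer_idx]
            for w, n in zip(client_weights, client_sizes)
        )
        aggregated.append(layer)
    return aggregated

# Usage example with two clients and a toy one-layer model:
w_a = [np.ones((2, 2))]
w_b = [np.zeros((2, 2))]
global_model = fedavg_aggregate([w_a, w_b], client_sizes=[30, 10])
print(global_model[0])  # 0.75 everywhere: client A holds 3/4 of the data
```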