Federated learning is a popular machine-learning paradigm. Ideally, it works best when all clients share a similar data distribution, but this is rarely the case in the real world. Consequently, federated learning on heterogeneous data has attracted growing attention from both academia and industry. In this project, we first run extensive experiments showing how data skew and quantity skew affect the performance of state-of-the-art federated learning algorithms. We then propose FedMix, a new algorithm that adjusts existing federated learning algorithms, and evaluate its performance. We find that state-of-the-art algorithms such as FedProx and FedNova do not yield significant improvements across all test cases. However, our experiments with both existing and new algorithms suggest that adjusting the client side is more effective than adjusting the server side.
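The abstract does not specify how the data skew in our experiments is generated; a common way to simulate label-distribution skew across clients is a Dirichlet partition, where a smaller concentration parameter alpha yields more heterogeneous client datasets. The sketch below is an illustrative example of this standard technique, not our exact experimental setup; the function name and parameters are our own.

```python
import numpy as np

def dirichlet_partition(labels, num_clients, alpha, seed=0):
    """Split sample indices across clients with label-distribution skew.

    For each class, client proportions are drawn from Dirichlet(alpha);
    smaller alpha produces stronger skew, while a large alpha approaches
    an IID split. This is an illustrative sketch, not our exact setup.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        rng.shuffle(idx)
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        # Cut points for splitting this class's samples among clients.
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(part.tolist())
    return [np.array(ci) for ci in client_indices]

# Example: 1000 samples over 10 classes, split among 5 clients with
# strong skew (alpha = 0.1); each client ends up dominated by few classes.
labels = np.repeat(np.arange(10), 100)
parts = dirichlet_partition(labels, num_clients=5, alpha=0.1)
```

Quantity skew can be simulated similarly by drawing the per-client sample counts themselves from a Dirichlet distribution instead of the per-class proportions.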