Federated Learning (FL) is a privacy-preserving machine learning scheme in which training happens on data federated across devices; the data never leaves the devices, which sustains user privacy. This is ensured by sending the untrained or partially trained model directly to the individual devices, training it locally ("on-device") on the device-owned data, and having the server aggregate all the partially trained model updates into a global model. Although almost all model learning schemes in the federated learning setup use gradient descent, the non-IID nature of the data availability introduces certain characteristic differences that affect training in comparison to centralized schemes. In this paper, we discuss the various factors that affect federated learning training, both due to the non-IID distributed nature of the data and due to the inherent differences between the federated learning approach and typical centralized gradient descent techniques. We empirically demonstrate the effect of the number of samples per device and the distribution of output labels on federated learning. In addition to the privacy advantage we seek through federated learning, we also study whether there is a cost advantage to using federated learning frameworks. We show that federated learning does have a cost advantage when the models to be trained are not overly large. All in all, we present the need for careful design of the model for both performance and cost.
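To make the server/device loop described above concrete, here is a minimal sketch of a FedAvg-style round: each device runs a few gradient-descent steps on its private data, and the server aggregates the resulting weights. This is an illustrative toy (a linear model with sample-size-weighted averaging and uneven samples per device); the model, names, and hyperparameters are assumptions for the sketch, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_train(weights, X, y, lr=0.1, epochs=5):
    """Device step: a few gradient-descent steps on one device's private data."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)  # MSE gradient for a linear model
        w -= lr * grad
    return w

def fed_avg(device_weights, device_sizes):
    """Server step: average device models, weighted by their sample counts."""
    total = sum(device_sizes)
    return sum(w * (n / total) for w, n in zip(device_weights, device_sizes))

# Simulate devices holding unequal amounts of data (one facet of non-IID-ness
# studied in the paper: the number of samples per device).
true_w = np.array([2.0, -1.0])
devices = []
for n in (20, 5, 50):
    X = rng.normal(size=(n, 2))
    y = X @ true_w + 0.01 * rng.normal(size=n)
    devices.append((X, y))

# Communication rounds: broadcast the global model, train locally, aggregate.
global_w = np.zeros(2)
for _ in range(20):
    local_ws = [local_train(global_w, X, y) for X, y in devices]
    global_w = fed_avg(local_ws, [len(y) for _, y in devices])

print(global_w)  # converges toward true_w; raw data never left the "devices"
```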