Federated learning is a privacy-preserving technique based on data decentralization, used to carry out machine learning or deep learning securely. In this paper we present theoretical aspects of federated learning, including an aggregation operator, the different types of federated learning, and issues that arise from how data are distributed across clients, together with an exhaustive analysis of a use case in which the number of clients varies. Specifically, we propose a medical image analysis use case based on chest X-ray images obtained from an open data repository. In addition to the privacy advantages, we study improvements in predictions (in terms of accuracy, loss, and area under the curve) and reductions in execution time with respect to the classical, centralized approach. Different clients are simulated from the training data, with the data for each client selected in an unbalanced manner. The results for three and ten clients are presented and compared with each other and against the centralized case. Two different problems related to intermittent clients are discussed, together with two approaches for handling each of them. Such problems may arise in a real scenario because some clients leave the training while others join it, or because clients suffer technical or connectivity problems. Finally, improvements and future work in the field are proposed.
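As an illustration of what an aggregation operator can look like, the following minimal sketch averages client model parameters weighted by each client's number of training samples (FedAvg-style weighted averaging). This is a common choice and an assumption here: the abstract does not state which operator the paper defines. The function name `federated_average`, the three toy clients, and their sample counts are hypothetical.

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """Sample-size-weighted average of per-client parameters.

    client_weights: one list of NumPy arrays (one array per layer) per client.
    client_sizes:   number of training samples held by each client.
    Assumption: weighted averaging stands in for the paper's operator.
    """
    total = float(sum(client_sizes))
    if total <= 0:
        raise ValueError("clients must hold at least one sample in total")
    n_layers = len(client_weights[0])
    # For each layer, sum the client arrays scaled by their data share.
    return [
        sum(w[layer] * (n / total) for w, n in zip(client_weights, client_sizes))
        for layer in range(n_layers)
    ]

# Three hypothetical clients with unbalanced data (10, 30 and 60 samples),
# each holding a single-layer "model" of two parameters.
clients = [
    [np.array([1.0, 2.0])],
    [np.array([3.0, 4.0])],
    [np.array([5.0, 6.0])],
]
sizes = [10, 30, 60]
global_weights = federated_average(clients, sizes)
```

With unbalanced clients, weighting by sample count lets the client holding 60% of the data contribute 60% of the aggregated parameters. An unweighted mean would instead treat all clients equally regardless of how much data they hold.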