Federated Learning is a novel framework that allows multiple devices or institutions to train a machine learning model collaboratively while keeping their data private. This decentralized approach is susceptible to the effects of statistical data heterogeneity, both across the participating entities and over time, which may hinder convergence. To mitigate such issues, several methods have been proposed in recent years. However, data may be heterogeneous in many different ways, and current proposals do not always specify the kind of heterogeneity they consider. In this work, we formally classify statistical data heterogeneity and review the most notable learning strategies able to cope with it. In addition, we introduce approaches from other machine learning frameworks, such as Continual Learning, that also deal with data heterogeneity and could easily be adapted to Federated Learning settings.