Federated learning allows multiple parties to collaboratively train a joint model without sharing local data. This enables applications of machine learning in settings where data is inherently distributed and undisclosable, such as the medical domain. In practice, joint training is usually achieved by aggregating local models, for which the local training objectives must, in expectation, be similar to the joint (global) objective. Often, however, local datasets are so small that the local objectives differ greatly from the global objective, causing federated learning to fail. We propose a novel approach that intertwines model aggregations with permutations of local models. The permutations expose each local model to a daisy chain of local datasets, resulting in more efficient training in data-sparse domains. This enables training on extremely small local datasets, such as patient data across hospitals, while retaining the training efficiency and privacy benefits of federated learning.
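To make the intertwining of aggregation and permutation concrete, the following is a minimal sketch of the training loop in a simulated setting. It assumes clients hold model parameters as NumPy arrays; the placeholder least-squares update, the random-permutation routing, and the period names `d` (daisy-chaining) and `b` (aggregation) are illustrative choices, not the paper's reference implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_update(params, data, lr=0.1):
    """One step of local training (placeholder least-squares gradient step)."""
    X, y = data
    grad = X.T @ (X @ params - y) / len(y)
    return params - lr * grad

def daisy_chain_round(client_params):
    """Permute models among clients: each client receives another's model."""
    perm = rng.permutation(len(client_params))
    return [client_params[i] for i in perm]

def aggregate(client_params):
    """Average all local models into a joint model (FedAvg-style)."""
    avg = np.mean(client_params, axis=0)
    return [avg.copy() for _ in client_params]

# Toy setup: k clients, each with an extremely small local dataset.
k, dim, n_local = 5, 3, 4
true_w = np.ones(dim)
datasets = []
for _ in range(k):
    X = rng.normal(size=(n_local, dim))
    datasets.append((X, X @ true_w))
params = [np.zeros(dim) for _ in range(k)]

d, b = 1, 10  # permute every d-th round, aggregate every b-th round
for t in range(1, 101):
    params = [local_update(p, data) for p, data in zip(params, datasets)]
    if t % b == 0:
        params = aggregate(params)          # aggregation round
    elif t % d == 0:
        params = daisy_chain_round(params)  # daisy-chaining round
```

In this sketch, only model parameters move between clients, never raw data, so each model is trained on a chain of small local datasets between consecutive aggregations.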