This paper presents and characterizes an Open Application Repository for Federated Learning (OARF), a benchmark suite for federated machine learning systems. Previously available benchmarks for federated learning have focused mainly on synthetic datasets and cover only a limited number of applications. OARF mimics more realistic application scenarios by using publicly available datasets as separate data silos across image, text, and structured data. Our characterization shows that the benchmark suite is diverse in data size, data distribution, feature distribution, and learning task complexity. We have developed reference implementations and used them to evaluate important aspects of federated learning, including model accuracy, communication cost, throughput, and convergence time. These extensive evaluations point to future research opportunities for federated learning systems. They also yield some interesting findings, for example that federated learning can effectively increase end-to-end throughput.