Federated learning offers the potential benefits of faster learning, better solutions, and a greater propensity to transfer when heterogeneous data from different parties increases diversity. However, because federated learning tasks tend to be large and complex, and training times are non-negligible, it is important for the aggregation algorithm to be robust to non-IID data and corrupted parties. This robustness relies on the ability to identify incompatible parties and weight them appropriately. Recent work assumes that a \textit{reference dataset} is available with which to perform this identification. We consider settings where no such reference dataset is available; rather, the quality and suitability of the parties need to be \textit{inferred}. We do so by drawing on ideas from crowdsourced predictions and collaborative filtering, where one must infer an unknown ground truth given proposals from participants of unknown quality. We propose novel federated learning aggregation algorithms based on Bayesian inference that adapt to the quality of the parties. Empirically, we show that the algorithms outperform standard and robust aggregation in federated learning on both synthetic and real data.
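
To make the idea of inferring party quality without a reference dataset concrete, the following is a minimal illustrative sketch and not the algorithm proposed in this work. It assumes a simple crowdsourcing-style model in which each party's update equals an unknown consensus plus Gaussian noise with a party-specific variance, and places an inverse-gamma prior on each variance. The function name `quality_weighted_aggregate` and the parameters `prior_shape` and `prior_scale` are hypothetical, introduced only for this example.

```python
import numpy as np

def quality_weighted_aggregate(updates, n_iters=20, prior_shape=1.0, prior_scale=1.0):
    """Aggregate party updates while inferring a noise variance per party.

    Assumed model (illustrative only): update_k = consensus + noise_k, with
    noise_k ~ N(0, var_k * I) and var_k ~ InverseGamma(prior_shape, prior_scale).
    """
    updates = np.asarray(updates, dtype=float)  # shape (n_parties, dim)
    n_parties, dim = updates.shape
    variances = np.ones(n_parties)              # start by trusting all parties equally
    for _ in range(n_iters):
        # Precision-weighted mean: the consensus estimate given current variances.
        weights = 1.0 / variances
        consensus = weights @ updates / weights.sum()
        # Squared distance of each party's update from the consensus.
        sq_err = ((updates - consensus) ** 2).sum(axis=1)
        # Posterior mode of each variance under the inverse-gamma prior.
        variances = (prior_scale + 0.5 * sq_err) / (prior_shape + 0.5 * dim + 1.0)
    weights = 1.0 / variances
    weights = weights / weights.sum()           # normalized trust in each party
    return weights @ updates, weights

# Hypothetical usage: five consistent parties and one corrupted party.
rng = np.random.default_rng(0)
truth = np.ones(10)
honest = truth + 0.1 * rng.normal(size=(5, 10))
corrupted = -5.0 * np.ones((1, 10))
aggregate, trust = quality_weighted_aggregate(np.vstack([honest, corrupted]))
```

In this sketch, a party whose updates sit far from the inferred consensus is assigned a large variance and therefore a small weight, so no held-out reference data is needed to down-weight it. The prior keeps every variance strictly positive, which prevents a single party from absorbing all of the weight.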