Federated learning is the centralized training of statistical models from decentralized data on mobile devices while preserving the privacy of each device. We present a robust aggregation approach to make federated learning robust to settings in which a fraction of the devices may be sending corrupted updates to the server. The approach relies on a robust aggregation oracle based on the geometric median, which returns a robust aggregate using a constant number of iterations of a regular non-robust averaging oracle. The robust aggregation oracle is privacy-preserving, similar to the non-robust secure average oracle it builds upon. We establish its convergence for least squares estimation of additive models. We provide experimental results with linear models and deep networks for three tasks in computer vision and natural language processing. The robust aggregation approach is agnostic to the level of corruption; it outperforms the classical aggregation approach in terms of robustness when the level of corruption is high, while being competitive in the regime of low corruption. Two variants, a faster one with one-step robust aggregation and another with on-device personalization, round off the paper.
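As an illustration of the idea above, the geometric median can be approximated with a few smoothed Weiszfeld iterations, each of which is just a re-weighted average and can therefore be realized with calls to the same (secure) averaging oracle used for non-robust aggregation. The sketch below is a minimal NumPy version; the function name, iteration count, and smoothing constant are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def geometric_median(points, num_iters=4, eps=1e-6):
    """Approximate geometric median via smoothed Weiszfeld iterations (a sketch).

    points: (n, d) array, one row per device update.
    num_iters: a small constant number of re-weighted averaging steps.
    eps: smoothing floor so that weights stay finite near a data point.
    """
    z = points.mean(axis=0)  # initialize at the plain (non-robust) average
    for _ in range(num_iters):
        # Distance of each update from the current estimate, smoothed below eps.
        dists = np.maximum(np.linalg.norm(points - z, axis=1), eps)
        # Updates far from the estimate (likely corrupted) get small weights.
        weights = 1.0 / dists
        weights /= weights.sum()
        # One call to a weighted averaging oracle.
        z = weights @ points
    return z
```

A corrupted update far from the honest cluster pulls the plain mean away, while the re-weighted estimate stays near the cluster, which is the robustness property the abstract describes.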