Federated learning (FL) has achieved great success as a privacy-preserving distributed training paradigm, where many edge devices collaboratively train a machine learning model by sharing model updates with a server instead of raw data. However, the heterogeneous computational and communication resources of edge devices give rise to stragglers that significantly decelerate the training process. To mitigate this issue, we propose a novel FL framework named stochastic coded federated learning (SCFL) that leverages coded computing techniques. In SCFL, before training starts, each edge device uploads a privacy-preserving coded dataset to the server, generated by adding Gaussian noise to a projected version of its local dataset. During training, the server computes gradients on the global coded dataset to compensate for the missing model updates of the straggling devices. We design a gradient aggregation scheme that ensures the aggregated model update is an unbiased estimate of the desired global update; this scheme also enables periodic model averaging to improve training efficiency. We characterize the tradeoff between the convergence performance and the privacy guarantee of SCFL: a noisier coded dataset provides stronger privacy protection for edge devices but degrades learning performance. We further develop a contract-based incentive mechanism to reconcile this conflict. Simulation results show that SCFL learns a better model within a given training time and achieves a better privacy-performance tradeoff than the baseline methods. In addition, the proposed incentive mechanism yields better training performance than the conventional Stackelberg game approach.
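To make the coded dataset construction concrete, below is a minimal sketch (in Python/NumPy) of one way to realize it: each device compresses its local data with a random linear projection and perturbs the result with additive Gaussian noise before uploading. The projection matrix `G`, the coded size `m_coded`, and the noise scale `noise_std` are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def make_coded_dataset(X, m_coded, noise_std, seed=None):
    """Sketch of privacy-preserving coded dataset generation.

    X: local dataset of shape (m, d). Each coded sample is a random
    linear combination of the m local samples, perturbed by Gaussian
    noise. All parameter choices here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    m, d = X.shape
    # Random projection: compresses m local samples into m_coded
    # coded samples via random linear combinations.
    G = rng.normal(0.0, 1.0 / np.sqrt(m), size=(m_coded, m))
    X_proj = G @ X
    # Additive Gaussian noise provides the privacy protection; a larger
    # noise_std strengthens privacy but degrades learning performance.
    return X_proj + rng.normal(0.0, noise_std, size=X_proj.shape)
```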
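A hedged sketch of an unbiased aggregation rule is shown below: the server uses the gradient computed on the global coded dataset as a baseline and applies an inverse-probability correction to the device updates that arrive in time, so that the expectation of the aggregate equals the full-participation global gradient. The arrival probabilities `arrival_probs` are assumed known to the server; this control-variate construction is one standard way to achieve unbiasedness and may differ from the paper's exact scheme.

```python
import numpy as np

def aggregate_unbiased(device_grads, coded_grad, arrival_probs):
    """Sketch of an unbiased straggler-tolerant aggregation step.

    device_grads: dict mapping device index -> gradient, containing
    only the devices that finished in time. coded_grad: the gradient
    the server computed on the global coded dataset. arrival_probs[i]:
    assumed-known probability that device i finishes in time.

    For any fixed baseline c, E[(g_i - c) * 1{arrived_i} / p_i] = g_i - c,
    so E[aggregate] equals the average of all n device gradients.
    """
    n = len(arrival_probs)
    # Start from the coded-data gradient as the baseline (control variate).
    agg = np.asarray(coded_grad, dtype=float).copy()
    for i, p in enumerate(arrival_probs):
        g = device_grads.get(i)
        if g is not None:
            # Inverse-probability correction for devices that arrived;
            # stragglers contribute only through the coded baseline.
            agg += (g - coded_grad) / (n * p)
    return agg
```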