In federated learning (FL) with top $r$ sparsification, millions of users collectively train a machine learning (ML) model locally on their personal data, while only communicating the most significant $r$ fraction of their local updates to reduce the communication cost. It has been shown that both the values and the indices of these selected (sparse) updates leak information about the users' personal data. In this work, we investigate different methods to carry out user-database communications in FL with top $r$ sparsification efficiently, while guaranteeing information-theoretic privacy of the users' personal data. However, these methods incur a considerable storage cost. As a solution, we present two schemes with different properties that combine MDS coded storage with a model segmentation mechanism to perform private FL with top $r$ sparsification, reducing the storage cost at the expense of a controllable amount of information leakage.
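To illustrate the sparsification step referred to above, the following is a minimal sketch (not taken from the paper) of top $r$ sparsification of a local update: only the largest $r$ fraction of entries by magnitude are kept, and their indices and values are what a user would communicate. The function name and the toy update are hypothetical.

```python
import numpy as np

def top_r_sparsify(update: np.ndarray, r: float):
    """Return indices and values of the top r fraction of |update|."""
    flat = update.ravel()
    k = max(1, int(np.ceil(r * flat.size)))        # number of entries kept
    idx = np.argpartition(np.abs(flat), -k)[-k:]   # indices of the k largest-magnitude entries
    return idx, flat[idx]

# Example: keep the top 10% of a toy local update of dimension 1000.
rng = np.random.default_rng(0)
local_update = rng.normal(size=1000)
indices, values = top_r_sparsify(local_update, r=0.1)
print(indices.shape, values.shape)                 # (100,) (100,)
```

As the sketch suggests, both the retained values and their positions depend on the user's data, which is why the indices themselves must also be protected.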