Collaborative machine learning algorithms are developed both for efficiency and to protect the privacy of the sensitive data used for processing. Federated learning is the most popular of these methods, where 1) learning is done locally, and 2) only a subset of the participants contributes to each training round. Although individual data is never shared explicitly, recent studies have shown that federated learning models can still leak information. In this paper we focus on the quality of individual training datasets, and show that such information can be inferred and linked to specific participants even when secure aggregation is applied. Specifically, we use three simple scoring rules to evaluate the per-round aggregated updates in the federated learning process, and mount a novel differential quality inference attack (i.e., relative quality ordering reconstruction). Through a series of image recognition experiments we show that the attack is able to infer the relative quality ordering of the participants. While quality inference is an attack in the traditional sense, it could also improve the federated learning process: we demonstrate how it can be used to (i) boost training efficiency and (ii) detect misbehavior. Finally, as a system designer might want to prevent quality inference in certain use-cases, we discuss mitigation approaches.
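To make the attack surface concrete, the sketch below simulates the general idea under secure aggregation: the attacker observes only a single scalar per round (standing in for a scoring rule applied to the aggregated update) together with the identities of that round's participants, and averages round scores per participant to reconstruct the relative quality ordering. The equal-attribution scoring used here is an illustrative assumption, not the paper's three scoring rules, and all names are hypothetical.

```python
# Minimal sketch of differential quality inference under secure aggregation.
# The attacker sees (a) which participants were selected each round and
# (b) one scalar score for the round's aggregated update (e.g. the change
# in loss on a small holdout set). Attributing each round score equally to
# the selected participants and averaging recovers a relative ordering.
# NOTE: this scoring scheme is an assumption for illustration only.

import numpy as np

rng = np.random.default_rng(0)

n_participants = 10
n_rounds = 200
subset_size = 3

# Hypothetical ground-truth per-participant quality (unknown to attacker).
true_quality = rng.normal(size=n_participants)

score_sum = np.zeros(n_participants)
score_cnt = np.zeros(n_participants)

for _ in range(n_rounds):
    # The server samples a random subset of participants for this round.
    subset = rng.choice(n_participants, size=subset_size, replace=False)
    # Under secure aggregation the attacker sees one scalar per round;
    # here it is modeled as the subset's mean quality plus noise.
    round_score = true_quality[subset].mean() + rng.normal(scale=0.1)
    # Attribute the round score equally to every selected participant.
    score_sum[subset] += round_score
    score_cnt[subset] += 1

estimated = score_sum / np.maximum(score_cnt, 1)

# Relative quality ordering reconstructed from aggregates alone.
print("true  order:", np.argsort(-true_quality))
print("inferred   :", np.argsort(-estimated))
```

Because participant subsets vary across rounds, each participant's per-round averages accumulate over different mixtures of co-participants, so with enough rounds the averages separate and the inferred ordering converges to the true one, even though no individual update is ever observed.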