The pervasive adoption of Internet-connected digital services has led to growing concerns about the personal data privacy of their customers. At the same time, machine learning (ML) techniques have been widely adopted by digital service providers to improve operational productivity and customer satisfaction. ML inevitably accesses and processes users' personal data, which can breach the relevant privacy protection regulations if not handled carefully. The situation is exacerbated by cloud-based implementations of digital services, where user data are captured and stored in distributed locations; aggregating the user data for ML could therefore constitute a serious breach of privacy regulations. Against this backdrop, Federated Learning (FL) has emerged as an approach that allows ML on distributed data without the data leaving their storage locations. However, depending on the nature of the digital services, data captured at different locations may carry different significance to the business operation, so weighted aggregation is highly desirable for improving the quality of the FL-trained model. Furthermore, to prevent leakage of user data from the aggregated gradients, cryptographic mechanisms are needed to enable secure aggregation in FL. In this paper, we propose a privacy-enhanced FL scheme that supports secure weighted aggregation. In addition, by devising a verification protocol based on Zero-Knowledge Proof (ZKP), the proposed scheme guards against fraudulent messages from FL participants. Experimental results show that our scheme is practical and secure. Compared to existing FL approaches, our scheme achieves secure weighted aggregation with an additional security guarantee against fraudulent messages, at an affordable overhead of 1.2 times the runtime and 1.3 times the communication cost.
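For concreteness, weighted aggregation in FL generally combines the clients' local updates in proportion to per-client weights. The following is a generic sketch using illustrative notation (\(\alpha_k\) for client \(k\)'s weight and \(w_k^{t+1}\) for its local model update at round \(t+1\)), not necessarily the notation or the exact aggregation rule used in this paper:

\[
w^{t+1} \;=\; \sum_{k=1}^{K} \frac{\alpha_k}{\sum_{j=1}^{K} \alpha_j}\, w_k^{t+1},
\]

where \(K\) is the number of participating clients. Standard FedAvg corresponds to the special case \(\alpha_k = n_k\), the size of client \(k\)'s local dataset; a business-driven weighting would instead choose \(\alpha_k\) to reflect the significance of the data captured at each location.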