FedV: Privacy-Preserving Federated Learning over Vertically Partitioned Data

Federated learning (FL) has been proposed to allow collaborative training of machine learning (ML) models among multiple parties where each party can keep its data private. In this paradigm, only model updates, such as model weights or gradients, are shared. Many existing approaches have focused on horizontal FL, where each party has the entire feature set and labels in the training data set. However, many real scenarios follow a vertically partitioned FL setup, where a complete feature set is formed only when all the datasets from the parties are combined, and the labels are only available to a single party. Privacy-preserving vertical FL is challenging because complete sets of labels and features are not owned by one entity. Existing approaches for vertical FL require multiple peer-to-peer communications among parties, leading to lengthy training times, and are restricted to (approximated) linear models and just two parties. To close this gap, we propose FedV, a framework for secure gradient computation in vertical settings for several widely used ML models such as linear models, logistic regression, and support vector machines. FedV removes the need for peer-to-peer communication among parties by using functional encryption schemes; this allows FedV to achieve faster training times. It also works for larger and changing sets of parties. We empirically demonstrate its applicability to multiple types of ML models and show a reduction of 10% to 70% in training time and 80% to 90% in data transfer with respect to state-of-the-art approaches.

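To make the vertically partitioned setup concrete, the following is a minimal plaintext sketch, not FedV's protocol. It assumes a linear model with mean squared error loss and two hypothetical parties (`party_a`, `party_b`) that hold disjoint feature columns for the same samples, with labels held by one party only. It shows that per-party partial inner products sum to the full prediction, so each party's gradient slice matches what centralized training would compute. All names and data are illustrative assumptions, and no encryption is performed; in a privacy-preserving system such as the one the abstract describes, these intermediate values would not be exchanged in the clear.

```python
# Illustrative only: plaintext arithmetic with NO encryption.
# All names, data, and the choice of loss are assumptions for this sketch.

def partial_scores(features, weights):
    """One party's contribution: inner product of its feature slice and weight slice."""
    return [sum(x * w for x, w in zip(row, weights)) for row in features]

def aggregate(partials):
    """Sum the per-party contributions sample by sample."""
    return [sum(values) for values in zip(*partials)]

def party_gradient(features, residuals):
    """Mean-squared-error gradient with respect to one party's weight slice."""
    n = len(features)
    dims = len(features[0])
    return [
        2.0 / n * sum(features[i][j] * residuals[i] for i in range(n))
        for j in range(dims)
    ]

# Three samples, split by columns: party A holds two features, party B holds one.
party_a = {"X": [[1.0, 2.0], [0.0, 1.0], [3.0, 1.0]], "w": [0.5, -1.0]}
party_b = {"X": [[4.0], [2.0], [0.0]], "w": [0.25]}
labels = [1.0, 0.0, 2.0]  # available to a single party only

# Vertical computation: combine partial scores, then form residuals.
scores = aggregate([partial_scores(p["X"], p["w"]) for p in (party_a, party_b)])
residuals = [s - y for s, y in zip(scores, labels)]
grad_a = party_gradient(party_a["X"], residuals)
grad_b = party_gradient(party_b["X"], residuals)

# Centralized reference: the same model trained on the joined feature set.
X_full = [a + b for a, b in zip(party_a["X"], party_b["X"])]
w_full = party_a["w"] + party_b["w"]
central_scores = partial_scores(X_full, w_full)
central_residuals = [s - y for s, y in zip(central_scores, labels)]
central_grad = party_gradient(X_full, central_residuals)

print(grad_a + grad_b)
print(central_grad)
```

The two printed gradients should agree, which is the property that lets a vertical scheme train the same model without any party seeing the full feature set. How the partial results are protected in transit (functional encryption, in the abstract's description) is outside the scope of this sketch.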
