In recent years, data have typically been distributed across multiple organizations, while data security has become increasingly important. Federated Learning (FL), which enables multiple parties to collaboratively train a model without exchanging their raw data, has therefore attracted growing attention. Based on the distribution of data, FL can be realized in three scenarios, i.e., horizontal, vertical, and hybrid. In this paper, we combine distributed machine learning techniques with vertical FL and propose a Distributed Vertical Federated Learning (DVFL) approach. DVFL exploits a fully distributed architecture within each party in order to accelerate the training process. In addition, we exploit Homomorphic Encryption (HE) to protect the data against honest-but-curious participants. We conduct extensive experiments in a large-scale cluster environment and a cloud environment to demonstrate the efficiency and scalability of our approach. The results show that DVFL scales well and achieves significant efficiency gains over baseline frameworks, reducing training time by up to 6.8 times with a single server and 15.1 times with multiple servers.
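To make the HE-based protection concrete, the following is a minimal sketch of how additively homomorphic encryption allows intermediate results from different parties to be aggregated without exposing the underlying values. It assumes the Paillier cryptosystem via the python-paillier `phe` package; the party names and the aggregation pattern are illustrative, not the paper's exact protocol.

```python
# Minimal sketch: secure aggregation of per-party partial results with
# Paillier homomorphic encryption (assumes the `phe` package: pip install phe).
from phe import paillier

# The key holder (e.g., the party that owns the labels) generates a keypair
# and shares only the public key with the other parties.
public_key, private_key = paillier.generate_paillier_keypair(n_length=2048)

# Each party computes a partial intermediate result on its own features
# (e.g., a partial inner product w_k . x_k) and encrypts it locally.
partial_results = [0.37, -1.25, 2.04]  # hypothetical per-party values
encrypted_parts = [public_key.encrypt(v) for v in partial_results]

# An aggregator sums the ciphertexts directly: Paillier is additively
# homomorphic, so Enc(a) + Enc(b) = Enc(a + b). The aggregator never
# sees any plaintext, which protects against honest-but-curious parties.
encrypted_sum = sum(encrypted_parts[1:], encrypted_parts[0])

# Only the private-key holder can decrypt the aggregated result.
total = private_key.decrypt(encrypted_sum)
print(f"aggregated intermediate result: {total:.2f}")  # 1.16
```

Because only additions (and scalar multiplications) are needed to combine partial results, an additively homomorphic scheme such as Paillier suffices; no party other than the key holder ever observes another party's intermediate values.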