Federated learning (FL) has recently emerged as a promising distributed machine learning (ML) technology, driven by the growing computational and sensing capabilities of end-user devices and by increasing concerns about user privacy. As a special architecture within FL, vertical FL (VFL) constructs a hyper ML model by combining sub-models from different clients. These sub-models are trained locally on vertically partitioned data with distinct attributes. The design of VFL is therefore fundamentally different from that of conventional FL, and it raises new and unique research issues. In this paper, we discuss key challenges in VFL together with effective solutions, and conduct experiments on real-life datasets to shed light on these issues. Specifically, we first propose a general framework for VFL and highlight the key differences between VFL and conventional FL. We then discuss research challenges rooted in VFL systems under four aspects: security and privacy risks, expensive computation and communication costs, possible structural damage caused by model splitting, and system heterogeneity. Finally, we develop solutions to address these challenges and conduct extensive experiments to demonstrate the effectiveness of the proposed solutions.
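To make the idea of vertically partitioned training concrete, the following is a minimal illustrative sketch and not the framework proposed in this paper. It assumes two hypothetical clients that hold different attribute columns of the same samples, a linear sub-model per client, and a coordinator that holds the labels; the names `Client`, `train`, and `make_toy_data` are invented for illustration. Only partial logits and per-sample gradients are exchanged, never raw attributes. Note that exchanging these values in the clear is not in itself privacy-preserving, which relates to the security and privacy risks discussed above.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class Client:
    """Hypothetical client: one vertical slice of the data plus a linear sub-model."""

    def __init__(self, features):
        self.features = features                 # rows: samples, cols: this client's attributes
        self.weights = [0.0] * len(features[0])

    def forward(self):
        # Partial logits are the only values sent to the coordinator.
        return [sum(w * x for w, x in zip(self.weights, row)) for row in self.features]

    def backward(self, grad_logits, lr):
        # Local gradient step using the per-sample gradients sent back by the coordinator.
        n = len(self.features)
        for j in range(len(self.weights)):
            g = sum(grad_logits[i] * self.features[i][j] for i in range(n)) / n
            self.weights[j] -= lr * g


def train(clients, labels, rounds=200, lr=0.5):
    """Coordinator loop (assumed to hold the labels): sum partial logits, return gradients."""
    losses = []
    for _ in range(rounds):
        partials = [c.forward() for c in clients]
        probs = [sigmoid(sum(p)) for p in zip(*partials)]
        # Mean binary cross-entropy; the small constant guards against log(0).
        loss = -sum(
            y * math.log(p + 1e-12) + (1 - y) * math.log(1 - p + 1e-12)
            for y, p in zip(labels, probs)
        ) / len(labels)
        losses.append(loss)
        grad_logits = [p - y for p, y in zip(probs, labels)]
        for c in clients:
            c.backward(grad_logits, lr)
    return losses


def make_toy_data(n=200, seed=0):
    # Synthetic data for illustration only: 4 attributes, linearly separable labels.
    rng = random.Random(seed)
    rows = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(n)]
    labels = [1 if r[0] + r[1] - r[2] + 0.5 * r[3] > 0 else 0 for r in rows]
    return rows, labels


rows, labels = make_toy_data()
client_a = Client([r[:2] for r in rows])   # holds attributes 0-1
client_b = Client([r[2:] for r in rows])   # holds attributes 2-3
losses = train([client_a, client_b], labels)
print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

In this toy setting the summed partial logits equal the logits of a single logistic regression over all four attributes, so the split changes where computation happens rather than what is learned. Practical VFL systems would typically add protections such as encryption or secure aggregation on the exchanged values, which this sketch omits.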