Vertical federated learning (VFL), in which the features of a dataset are distributed across multiple parties, is an important area of machine learning. However, the communication complexity of VFL is typically very high. In this paper, we propose a unified framework that constructs coresets in a distributed fashion for communication-efficient VFL. We study two important learning tasks in the VFL setting, regularized linear regression and $k$-means clustering, and apply our coreset framework to both problems. We show theoretically that using coresets can drastically reduce the communication complexity while nearly maintaining the solution quality. Numerical experiments corroborate our theoretical findings.