Vertical federated learning is a collaborative machine learning framework for training deep learning models on vertically partitioned data with privacy preservation. It has attracted much attention from both academia and industry. Unfortunately, applying most existing vertical federated learning methods in real-world applications still faces two daunting challenges. First, most existing methods rest on the strong assumption that at least one party holds the complete set of labels for all data samples. This assumption fails in many practical scenarios where labels are horizontally partitioned and each party holds only partial labels. Existing vertical federated learning methods can then utilize only those partial labels, which may lead to inadequate model updates during end-to-end backpropagation. Second, computational and communication resources vary across parties. Parties with limited computational and communication resources become stragglers and slow down the convergence of training, and this straggler problem is exacerbated when labels are horizontally partitioned in vertical federated learning. To address these challenges, we propose a novel vertical federated learning framework named Cascade Vertical Federated Learning (CVFL), which fully utilizes all horizontally partitioned labels to train neural networks with privacy preservation. To mitigate the straggler problem, we design a novel optimization objective that increases the stragglers' contribution to the trained models. We conduct a series of experiments to verify the effectiveness of CVFL: it achieves performance comparable to centralized training (e.g., in accuracy on classification tasks), and the new optimization objective further mitigates the straggler problem compared with using only an asynchronous aggregation mechanism during training.
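To make the setting concrete, the following is a minimal NumPy sketch (not CVFL itself, and with all names and dimensions chosen for illustration) of vertical federated learning with horizontally partitioned labels: two parties each hold a vertical slice of the features, each party holds labels for only half of the samples, and only intermediate partial logits cross the party boundary while each label holder contributes the error signal for its own labeled samples.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 200 samples whose features are vertically split
# between two parties (party A holds 3 features, party B holds 2).
n = 200
X_a = rng.normal(size=(n, 3))
X_b = rng.normal(size=(n, 2))
true_w = rng.normal(size=5)
y = (np.concatenate([X_a, X_b], axis=1) @ true_w > 0).astype(float)

# Labels are horizontally partitioned: each party knows labels
# only for its own half of the samples.
idx_a, idx_b = np.arange(n // 2), np.arange(n // 2, n)

w_a = np.zeros(3)  # party A's local model parameters
w_b = np.zeros(2)  # party B's local model parameters
lr = 0.5

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(p, y):
    eps = 1e-9
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

for step in range(300):
    # Each party contributes its partial logits; only these
    # intermediate values, not raw features, are exchanged.
    logits = X_a @ w_a + X_b @ w_b
    p = sigmoid(logits)
    # Each label-holding party derives the error signal for the
    # samples it has labels for; together they cover all samples,
    # so the update uses the union of the partial labels.
    err = np.zeros(n)
    err[idx_a] = p[idx_a] - y[idx_a]
    err[idx_b] = p[idx_b] - y[idx_b]
    # Each party takes a local gradient step on its own feature slice.
    w_a -= lr * X_a.T @ err / n
    w_b -= lr * X_b.T @ err / n

final_loss = log_loss(sigmoid(X_a @ w_a + X_b @ w_b), y)
acc = np.mean((sigmoid(X_a @ w_a + X_b @ w_b) > 0.5) == y)
print(final_loss, acc)
```

If either party instead used only its own half of the labels (as a method assuming a single label holder would), half of the error vector would stay zero and the shared model would be updated on only half of the data, which is the inadequate-update problem the abstract describes.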