Many application scenarios call for training a machine learning model across multiple participants. Federated learning (FL) was proposed to enable joint training of a deep learning model using the local data of each party without revealing that data to others. Among the various types of FL, vertical FL handles data sources that share the same sample ID space but have different feature spaces. However, existing vertical FL methods suffer from limitations such as restrictive neural network structures and slow training, and they often cannot take advantage of samples with unmatched IDs. In this work, we propose an FL method called self-taught federated learning to address these issues; it applies unsupervised feature extraction techniques to distributed supervised deep learning tasks. In this method, only latent variables are transmitted to other parties for model training, while privacy is preserved by storing the raw data as well as the activations, weights, and biases locally. Extensive experiments are performed to evaluate and demonstrate the validity and efficiency of the proposed method.
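To make the transmission pattern concrete, the sketch below illustrates the general idea under stated assumptions; it is not the paper's exact protocol. Each passive party trains a local unsupervised feature extractor (here a toy autoencoder) on its own feature partition and transmits only the resulting latent codes to the label-holding party, which trains the supervised model. The class and function names (`LocalAutoencoder`, `train_local_extractor`), the dimensions, and the choice of a single-layer autoencoder are all illustrative assumptions.

```python
# Minimal sketch of the transmission pattern described in the abstract
# (illustrative assumptions throughout, not the paper's exact protocol):
# raw data, activations, weights, and biases stay local; only latent
# variables cross party boundaries.
import torch
import torch.nn as nn


class LocalAutoencoder(nn.Module):
    """Unsupervised feature extractor kept entirely inside one party."""

    def __init__(self, in_dim: int, latent_dim: int):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, latent_dim), nn.ReLU())
        self.decoder = nn.Linear(latent_dim, in_dim)

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z


def train_local_extractor(x, latent_dim=8, epochs=50):
    """Runs locally at a party; only the latent codes are ever shared."""
    ae = LocalAutoencoder(x.shape[1], latent_dim)
    opt = torch.optim.Adam(ae.parameters(), lr=1e-2)
    for _ in range(epochs):
        opt.zero_grad()
        recon, _ = ae(x)
        nn.functional.mse_loss(recon, x).backward()
        opt.step()
    with torch.no_grad():
        _, z = ae(x)
    return z  # transmitted latent variables; x and ae never leave the party


# Two parties hold disjoint feature blocks for the same aligned sample IDs.
x_a = torch.randn(256, 10)       # party A's private features
x_b = torch.randn(256, 6)        # party B's private features
y = torch.randint(0, 2, (256,))  # labels, held by the active party

z_a = train_local_extractor(x_a)  # computed inside party A
z_b = train_local_extractor(x_b)  # computed inside party B

# The label-holding party trains the supervised head on concatenated latents.
head = nn.Linear(z_a.shape[1] + z_b.shape[1], 2)
opt = torch.optim.Adam(head.parameters(), lr=1e-2)
features = torch.cat([z_a, z_b], dim=1)
for _ in range(100):
    opt.zero_grad()
    nn.functional.cross_entropy(head(features), y).backward()
    opt.step()
```

Because the extractors are trained with an unsupervised objective, each party can in principle fit them on all of its local samples, including those whose IDs have no match at the other parties, which is how the unmatched-ID data mentioned above could be exploited.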