Federated learning (FL) has recently attracted intensive research because it lets scattered clients collaboratively train machine learning models while preserving data privacy. Commonly, a parameter server (PS) is deployed to aggregate the model parameters contributed by different clients. Decentralized federated learning (DFL) extends FL by allowing clients to aggregate model parameters directly with their neighbours. DFL is particularly well suited to vehicular networks, where vehicles communicate with each other in a vehicle-to-vehicle (V2V) manner. However, because of restrictions on vehicle routes and communication distances, it is hard for individual vehicles to exchange models sufficiently with others. The data sources contributing to the model on an individual vehicle may therefore not be diversified enough, resulting in poor model accuracy. To address this problem, we propose the DFL-DDS (DFL with Diversified Data Sources) algorithm to diversify data sources in DFL. Specifically, each vehicle maintains a state vector that records the contribution weight of each data source to its model. The Kullback-Leibler (KL) divergence is adopted to measure the diversity of a state vector. To boost the convergence of DFL, a vehicle tunes the aggregation weight of each data source by minimizing the KL divergence of its state vector, and the effectiveness of this tuning in diversifying data sources can be proved theoretically. Finally, DFL-DDS is evaluated through extensive experiments on the MNIST and CIFAR-10 datasets, which demonstrate that DFL-DDS accelerates the convergence of DFL and improves model accuracy significantly compared with state-of-the-art baselines.
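To make the state-vector idea concrete, the following is a minimal sketch rather than the DFL-DDS algorithm itself. It rests on several assumptions not stated in the abstract: a state vector is treated as a probability distribution over data sources, diversity is measured as the KL divergence from that vector to the uniform distribution, averaging two models with weight `alpha` is assumed to mix their state vectors linearly, and the minimization is done by a simple grid search. The function names `kl_to_uniform`, `mix`, and `best_mixing_weight` are hypothetical.

```python
import math


def kl_to_uniform(state):
    """KL(state || uniform) for a probability vector.

    Assumption: the uniform distribution is the diversity target, so a
    value of 0 means every data source contributes equally.
    """
    n = len(state)
    # Zero entries contribute nothing (0 * log 0 is taken as 0).
    return sum(p * math.log(p * n) for p in state if p > 0)


def mix(own, neighbour, alpha):
    """State vector after aggregation, assuming contributions mix linearly."""
    return [(1 - alpha) * a + alpha * b for a, b in zip(own, neighbour)]


def best_mixing_weight(own, neighbour, steps=1000):
    """Grid-search the weight in [0, 1] that minimizes the KL divergence
    of the mixed state vector (a stand-in for the paper's optimization)."""
    candidates = (k / steps for k in range(steps + 1))
    return min(candidates, key=lambda a: kl_to_uniform(mix(own, neighbour, a)))


# A vehicle trained only on its own data meets a neighbour whose model
# draws evenly on two other data sources.
own_state = [1.0, 0.0, 0.0, 0.0]
neighbour_state = [0.0, 0.5, 0.5, 0.0]

alpha = best_mixing_weight(own_state, neighbour_state)
mixed_state = mix(own_state, neighbour_state, alpha)
print(alpha, mixed_state, kl_to_uniform(mixed_state))
```

In this toy case the chosen weight leans towards the neighbour, because its model carries two data sources that the vehicle has not yet seen, and the mixed state vector ends up closer to uniform than either input.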