The industrial Internet of Things (IIoT) plays an integral role in Industry 4.0 and produces massive amounts of data for industrial intelligence. These data reside on decentralized devices in modern factories. To protect the confidentiality of industrial data, federated learning (FL) was introduced to collaboratively train shared machine learning models. However, the local data collected by different devices are skewed in class distribution, which degrades industrial FL performance. This challenge has been widely studied at the mobile edge, but existing approaches ignore the rapidly changing streaming data and the clustered nature of factory devices and, more seriously, may threaten data security. In this paper, we propose FedGS, a hierarchical cloud-edge-end FL framework for 5G-empowered industries, to improve industrial FL performance on non-i.i.d. data. Taking advantage of naturally clustered factory devices, FedGS uses a gradient-based binary permutation algorithm (GBP-CS) to select a subset of devices within each factory and build homogeneous super nodes that participate in FL training. We then propose a compound-step synchronization protocol to coordinate training within and among these super nodes, which is highly robust to data heterogeneity. The proposed methods are time-efficient and adapt to dynamic environments without exposing confidential industrial data to risky manipulation. We prove that FedGS has better convergence performance than FedAvg and give a relaxed condition under which FedGS is more communication-efficient. Extensive experiments show that FedGS improves accuracy by 3.5% and reduces training rounds by 59% on average, confirming its effectiveness and efficiency on non-i.i.d. data.
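To make the idea of a "homogeneous super node" concrete, the following is a minimal sketch of device selection within one factory. It is not GBP-CS, whose details the abstract does not give. It swaps in a simple greedy heuristic that picks devices so that their pooled class histogram is as close to uniform as possible. The function names `distance_to_uniform` and `select_devices`, the per-device label-count input, and the Euclidean distance criterion are all assumptions made for illustration.

```python
from math import sqrt


def distance_to_uniform(histogram):
    """Euclidean distance between a normalized class histogram and the
    uniform distribution over the same number of classes."""
    num_classes = len(histogram)
    total = sum(histogram)
    if total == 0:
        # An empty histogram is treated as the all-zero vector.
        return sqrt(1.0 / num_classes)
    return sqrt(sum((c / total - 1.0 / num_classes) ** 2 for c in histogram))


def select_devices(class_counts, num_select):
    """Greedy stand-in for device selection (hypothetical, NOT GBP-CS).

    class_counts: one list of per-class sample counts per device.
    num_select:   number of devices to group into one super node.
    Returns (selected device indices, pooled class histogram).
    """
    num_devices = len(class_counts)
    if not 0 < num_select <= num_devices:
        raise ValueError("num_select must be between 1 and the device count")

    num_classes = len(class_counts[0])
    selected = []
    pooled = [0] * num_classes

    for _ in range(num_select):
        best_device, best_distance = None, float("inf")
        for device in range(num_devices):
            if device in selected:
                continue
            # Histogram of the super node if this device were added.
            candidate = [p + c for p, c in zip(pooled, class_counts[device])]
            distance = distance_to_uniform(candidate)
            if distance < best_distance:
                best_device, best_distance = device, distance
        selected.append(best_device)
        pooled = [p + c for p, c in zip(pooled, class_counts[best_device])]

    return selected, pooled


if __name__ == "__main__":
    # Three devices, two classes: devices 0 and 2 only hold class 0,
    # device 1 only holds class 1.
    counts = [[10, 0], [0, 10], [10, 0]]
    devices, histogram = select_devices(counts, num_select=2)
    print(devices, histogram)
```

In this toy input, pairing one class-0 device with the class-1 device yields a balanced pooled histogram, which is the property the super nodes are meant to have. A greedy search like this only illustrates the objective and makes no claim about the efficiency or optimality of the paper's algorithm.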