Real-time machine learning (ML) has recently attracted significant interest due to its potential to support instantaneous learning, adaptation, and decision making in a wide range of application domains, including self-driving vehicles, intelligent transportation, and industrial automation. We investigate real-time ML in a federated edge intelligence (FEI) system, an edge computing system that implements federated learning (FL) solutions based on data samples collected and uploaded from decentralized data networks. FEI systems often exhibit heterogeneous distributions of communication and computational resources, as well as non-i.i.d. data samples, resulting in long model training times and inefficient resource utilization. Motivated by this observation, we propose a time-sensitive federated learning (TS-FL) framework to minimize the overall run-time for collaboratively training a shared ML model. We investigate training acceleration solutions for TS-FL with both synchronous coordination (TS-FL-SC) and asynchronous coordination (TS-FL-ASC). To address the straggler effect in TS-FL-SC, we develop an analytical solution that characterizes the impact of selecting different subsets of edge servers on the overall model training time. We propose a server dropping-based solution that allows slow edge servers to be removed from model training if their impact on the resulting model accuracy is limited. We also propose a joint optimization algorithm that minimizes the overall time consumption of model training by selecting the participating edge servers and the number of local epochs. For TS-FL-ASC, we develop an analytical expression that characterizes the impact of the staleness effect of asynchronous coordination and the straggler effect of FL on the time consumption. Experimental results show that TS-FL-SC and TS-FL-ASC can reduce the overall model training time by up to 63% and 28%, respectively.
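The straggler effect in synchronous coordination can be illustrated with a minimal sketch. It assumes, purely for illustration, that a round lasts as long as the slowest selected edge server needs to finish its local epochs and upload its update. The `EdgeServer` fields, the `round_time` model, and the `drop_stragglers` rule are hypothetical and are not the selection algorithm proposed in the paper; in particular, the sketch ignores the effect of dropping a server on model accuracy, which the proposed solution takes into account.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EdgeServer:
    name: str
    compute_s_per_epoch: float  # hypothetical: seconds per local epoch
    upload_s: float             # hypothetical: seconds to upload one model update


def server_time(server: EdgeServer, local_epochs: int) -> float:
    """Time one server needs to finish a round (local training plus upload)."""
    return server.compute_s_per_epoch * local_epochs + server.upload_s


def round_time(servers: list[EdgeServer], local_epochs: int) -> float:
    """Synchronous round time: the coordinator waits for the slowest server."""
    return max(server_time(s, local_epochs) for s in servers)


def drop_stragglers(servers: list[EdgeServer], local_epochs: int, keep: int) -> list[EdgeServer]:
    """Keep the `keep` fastest servers (a simple stand-in, not the paper's rule)."""
    ranked = sorted(servers, key=lambda s: server_time(s, local_epochs))
    return ranked[:keep]


# Illustrative values only, not measurements from the paper.
servers = [
    EdgeServer("a", compute_s_per_epoch=1.0, upload_s=0.5),
    EdgeServer("b", compute_s_per_epoch=1.5, upload_s=1.0),
    EdgeServer("c", compute_s_per_epoch=6.0, upload_s=4.0),  # straggler
]

full = round_time(servers, local_epochs=2)
kept = drop_stragglers(servers, local_epochs=2, keep=2)
reduced = round_time(kept, local_epochs=2)
print(f"all servers: {full:.1f} s per round, after dropping: {reduced:.1f} s per round")
```

In this toy setting a single slow server dominates the round time, which is why removing it shortens every round. Whether removal is worthwhile in practice depends on how much that server's data contributes to the accuracy of the shared model.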