Federated learning (FL) and split learning (SL) are state-of-the-art distributed machine learning techniques that enable model training without accessing raw data on clients or end devices. However, their \emph{comparative training performance} on real-world resource-restricted Internet of Things (IoT) devices, e.g., Raspberry Pi, has to our knowledge not yet been evaluated, leaving practitioners without a convenient reference. This work provides the first empirical comparison of FL and SL in real-world IoT settings in terms of (i) learning performance under heterogeneous data distributions and (ii) on-device execution overhead. Our analyses demonstrate that SL achieves better learning performance than FL under an imbalanced data distribution but worse performance than FL under an extreme non-IID data distribution. Recently, FL and SL have been combined to form splitfed learning (SFL), which leverages the benefits of each (e.g., the parallel training of FL and the lightweight on-device computation of SL). We therefore deploy FL, SL, and SFL on Raspberry Pi devices and evaluate their training time, communication overhead, power consumption, and memory usage. Beyond these evaluations, we apply two optimizations. First, we generalize SFL by carefully examining a hybrid type of model training at the server side. The generalized SFL merges sequential (dependent) and parallel (independent) model training processes and is thus beneficial for systems with a large number of IoT devices, specifically for server-side operations. Second, we propose pragmatic techniques that substantially reduce the communication overhead of SL and (generalized) SFL, by up to four times.
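To make the contrast between the two paradigms concrete, the following minimal sketch shows what leaves the device in each case: full model parameters in FL, which the server averages, versus cut-layer activations in SL, which the server continues to process. It illustrates the general techniques only and is not the configuration evaluated in this work; the function names `fedavg` and `split_forward`, the toy linear layers, and the ReLU at the cut layer are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def fedavg(client_weights: Sequence[Sequence[float]],
           client_sizes: Sequence[int]) -> List[float]:
    """FL-style aggregation: average client parameters, weighted by local dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def _linear(rows: Sequence[Sequence[float]], x: Sequence[float]) -> List[float]:
    """Matrix-vector product for a toy fully connected layer (no bias)."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in rows]


def split_forward(x: Sequence[float],
                  client_layer: Sequence[Sequence[float]],
                  server_layer: Sequence[Sequence[float]]
                  ) -> Tuple[List[float], List[float]]:
    """SL-style forward pass: the client runs up to the cut layer and sends only
    the resulting activations; the server runs the remaining layers."""
    # Client side (hypothetical single layer followed by ReLU).
    activations = [max(0.0, a) for a in _linear(client_layer, x)]
    # Server side: receives the activations, never the raw input x.
    output = _linear(server_layer, activations)
    return activations, output


# FL: two clients holding 1 and 3 samples respectively.
global_weights = fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3])

# SL: one sample passed through a two-unit client layer and a one-unit server layer.
sent, out = split_forward([1.0, 2.0], [[1.0, 0.0], [0.0, -1.0]], [[1.0, 1.0]])
```

In this sketch the FL payload grows with the number of model parameters, whereas the SL payload grows with the cut-layer width and the number of samples processed, which is one reason the communication overhead of the two approaches can differ substantially.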