Split Learning (SL) is a promising variant of Federated Learning (FL), in which the AI model is split between the clients and the server and trained collaboratively. By offloading the computation-intensive portion of the model to the server, SL enables efficient model training on resource-constrained clients. Despite its booming applications, SL still lacks a rigorous convergence analysis on non-IID data, which is critical for hyperparameter selection. In this paper, we first prove that SL exhibits an $\mathcal{O}(1/\sqrt{R})$ convergence rate for non-convex objectives on non-IID data, where $R$ is the total number of training rounds. The derived convergence results facilitate understanding the effect of several crucial factors in SL (e.g., data heterogeneity and the synchronization interval). Furthermore, by comparing with the convergence results of FL, we show that the guarantee of SL is worse than that of FL in terms of training rounds on non-IID data. The experimental results verify our theory. More findings on the comparison between FL and SL in cross-device settings are also reported.
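To make the client/server split concrete, the following is a minimal sketch of a single split-learning training step in PyTorch, assuming a toy two-part model; the names (`client_model`, `server_model`, the cut-layer width, learning rates) are illustrative assumptions and not the paper's exact training procedure, which involves multiple clients and periodic synchronization.

```python
import torch
import torch.nn as nn

# Hypothetical toy model split at a "cut layer" into a client-side part
# and a server-side part (sizes and names are illustrative).
client_model = nn.Sequential(nn.Linear(20, 64), nn.ReLU())
server_model = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 2))

client_opt = torch.optim.SGD(client_model.parameters(), lr=0.1)
server_opt = torch.optim.SGD(server_model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

# One split-learning step on a synthetic mini-batch.
x = torch.randn(8, 20)
y = torch.randint(0, 2, (8,))

# Client: forward pass up to the cut layer ("smashed data").
smashed = client_model(x)
# Detaching and re-enabling grad mimics transmitting the activations
# to the server over the network.
smashed_remote = smashed.detach().requires_grad_(True)

# Server: finish the forward pass, compute the loss, and back-propagate
# down to the received activations; update the server-side parameters.
logits = server_model(smashed_remote)
loss = loss_fn(logits, y)
server_opt.zero_grad()
loss.backward()
server_opt.step()

# Client: receive the gradient w.r.t. the cut-layer activations and
# continue back-propagation through the client-side layers.
client_opt.zero_grad()
smashed.backward(smashed_remote.grad)
client_opt.step()

print(f"loss = {loss.item():.4f}")
```

The detach-and-backward pattern above is what lets the computation-heavy server-side layers be trained remotely while the client only runs the shallow portion of the network.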