Split learning (SL) aims to protect user data privacy by splitting a deep model between client and server and keeping private data local to the client. SL has been shown to achieve accuracy similar to that of centralized learning. In SL with multiple clients, locally trained weights are shared among clients for local model aggregation. This paper investigates the potential for data leakage caused by this local weight sharing by performing model inversion attacks. To mitigate the identified leakage, we propose and analyze privacy-enhanced SL (P-SL), i.e., SL without local weight sharing, to strengthen client-side data privacy. We also propose parallelized P-SL, which speeds up training by employing multiple servers without reducing accuracy. Finally, we investigate P-SL with late-joining clients and develop a server-side cache-based training scheme to address the forgetting phenomenon in SL. Experimental results demonstrate that P-SL reduces client-side data leakage by up to 50% compared to SL. Moreover, P-SL and its cache-based variant achieve accuracy comparable to SL under various data distributions, with lower computation and communication costs. Caching in P-SL also mitigates the negative effect of forgetting, stabilizes learning, and enables effective, low-complexity training in a dynamic environment with late-arriving clients.
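To make the client-server split concrete, below is a minimal PyTorch-style sketch of one SL training step: the client runs the first layers on its private data and sends only the intermediate activation ("smashed data") to the server, which completes the forward pass and returns the activation's gradient. The architecture, split point, and hyperparameters are illustrative assumptions, not the paper's actual setup; note that no client weights cross the network, which is the property P-SL preserves among clients.

```python
import torch
import torch.nn as nn

# Hypothetical split of a small CNN between client and server.
client_model = nn.Sequential(            # runs on the client, next to raw data
    nn.Conv2d(3, 16, 3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
)
server_model = nn.Sequential(            # runs on the server
    nn.Flatten(),
    nn.Linear(16 * 16 * 16, 10),
)

opt_client = torch.optim.SGD(client_model.parameters(), lr=0.01)
opt_server = torch.optim.SGD(server_model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(8, 3, 32, 32)            # private data stays on the client
y = torch.randint(0, 10, (8,))

# Client forward: only the smashed data leaves the client, never raw inputs
# or client weights.
smashed = client_model(x)
smashed_sent = smashed.detach().requires_grad_()  # what crosses the network

# Server forward/backward on the received activation.
logits = server_model(smashed_sent)
loss = loss_fn(logits, y)
opt_server.zero_grad()
loss.backward()
opt_server.step()

# The gradient of the smashed data is returned to the client, which finishes
# backpropagation through its own layers locally.
opt_client.zero_grad()
smashed.backward(smashed_sent.grad)
opt_client.step()
```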