Split learning (SL) is a privacy-preserving distributed deep learning method for training a collaborative model without sharing patients' raw data between clients. Split learning can additionally incorporate a privacy-preserving technique called the no-peek algorithm, which is robust to adversarial attacks. These privacy benefits make split learning well suited to the healthcare domain. However, the split learning algorithm has a flaw: the collaborative model is trained sequentially, i.e., one client trains after another. We point out that a model trained with the split learning algorithm becomes biased towards the data of the clients trained towards the end of a round, which makes SL highly sensitive to the order in which clients are considered for training. We demonstrate that the model trained on the data of all clients performs poorly on the data of the client considered earliest in a round. Moreover, we show that this effect becomes more prominent as the number of clients increases. We also demonstrate that the SplitFedv3 algorithm mitigates this problem while still retaining the privacy benefits of split learning.
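To make the sequential dependence concrete, the sketch below walks through one split-learning round in which each client forwards its cut-layer activations to the server, and the next client continues from the weights left by the previous one. This is a minimal, hypothetical PyTorch illustration, not the paper's implementation; the architectures, optimizer settings, and synthetic data are assumptions made for demonstration.

```python
import torch
import torch.nn as nn

# Minimal sketch of one sequential split-learning round (illustrative only).
# The sub-network shapes, learning rate, and client data below are assumptions.

torch.manual_seed(0)

# Client-side sub-network, up to the cut layer.
client_model = nn.Sequential(nn.Linear(16, 32), nn.ReLU())
# Server-side sub-network, from the cut layer to the output.
server_model = nn.Sequential(nn.Linear(32, 2))

client_opt = torch.optim.SGD(client_model.parameters(), lr=0.1)
server_opt = torch.optim.SGD(server_model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

# Synthetic per-client datasets; in SL the raw data never leaves the client.
clients = [(torch.randn(8, 16), torch.randint(0, 2, (8,))) for _ in range(3)]

# Sequential training: client k+1 starts from the weights left by client k,
# so the shared model drifts towards the data of the clients trained last.
for x, y in clients:
    client_opt.zero_grad()
    server_opt.zero_grad()
    smashed = client_model(x)    # forward pass to the cut layer (sent to server)
    out = server_model(smashed)  # server completes the forward pass
    loss = loss_fn(out, y)
    loss.backward()              # gradients flow back through the cut layer
    server_opt.step()
    client_opt.step()
```

Because each client's update overwrites the shared weights in turn, the final model reflects the last clients more than the first, which is the order-sensitivity the abstract describes.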