Computation-efficient Deep Model Training for Ciphertext-based Cross-silo Federated Learning

Although cross-silo federated learning improves the privacy of training data by exchanging model updates rather than raw data, sharing updates (e.g., local gradients or parameters) still carries privacy risks. To ensure that no updates are revealed to the server, industrial FL schemes allow clients (e.g., financial or medical institutions) to mask local gradients with homomorphic encryption (HE). In this case, the server cannot obtain the updates, but curious clients can still obtain this information and use it to infer other clients' private data. To alleviate this situation, the most direct idea is to let clients train deep models in the encrypted domain. Unfortunately, the resulting solution suffers from poor accuracy and high cost, since existing advanced HE schemes are incompatible with non-linear activation functions and computationally inefficient. In this paper, we propose a \emph{computation-efficient deep model training scheme for ciphertext-based cross-silo federated learning} to comprehensively guarantee privacy. First, we customize \emph{a novel one-time-pad-style model encryption method} that directly supports non-linear activation functions and decimal arithmetic operations in the encrypted domain. Then, we design a hybrid privacy-preserving scheme by combining our model encryption method with secret sharing techniques to keep updates secret from the clients and to prevent the server from obtaining the local gradients of each client. Extensive experiments demonstrate that, for both regression and classification tasks, our scheme achieves the same accuracy as non-private approaches and outperforms the state-of-the-art HE-based scheme. Moreover, the training time of our scheme is almost the same as that of non-private approaches and much shorter than that of HE-based schemes. Our scheme trains a $9$-layer neural network on the MNIST dataset in less than one hour.
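The abstract does not say which secret sharing construction is used. As a generic illustration of how secret sharing can hide each client's local gradient while still revealing the aggregate, the following minimal sketch uses plain additive secret sharing over a prime modulus with fixed-point encoding. It is not the paper's scheme: the names `MODULUS`, `SCALE`, `encode`, `decode`, `share`, and `aggregate`, as well as the chosen modulus and scale, are assumptions made for this example only.

```python
import secrets
from typing import List

MODULUS = 2**61 - 1  # assumed share modulus (illustrative choice)
SCALE = 10**6        # assumed fixed-point scale for decimal gradients


def encode(x: float) -> int:
    """Map a float to a fixed-point residue modulo MODULUS."""
    return round(x * SCALE) % MODULUS


def decode(v: int) -> float:
    """Map a residue back to a signed float."""
    if v > MODULUS // 2:
        v -= MODULUS
    return v / SCALE


def share(value: int, n: int) -> List[int]:
    """Split value into n additive shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def aggregate(gradients: List[List[float]]) -> List[float]:
    """Sum per-client gradient vectors using only additive shares."""
    n = len(gradients)
    dim = len(gradients[0])
    # received[j][k]: running sum of the shares of coordinate k sent to party j
    received = [[0] * dim for _ in range(n)]
    for grad in gradients:
        for k, g in enumerate(grad):
            for j, s in enumerate(share(encode(g), n)):
                received[j][k] = (received[j][k] + s) % MODULUS
    # Each party reveals only its partial sums; their total is the aggregate.
    totals = [sum(received[j][k] for j in range(n)) % MODULUS
              for k in range(dim)]
    return [decode(t) for t in totals]
```

Calling `aggregate` on a list of per-client gradient vectors returns their coordinate-wise sum, up to fixed-point rounding. Any single share is uniformly random, so no individual party in this sketch learns another client's gradient from the shares it receives.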
