Secure multi-party computation-based machine learning, referred to as MPL, has become an important technology for utilizing data from multiple parties with privacy preservation. While MPL provides rigorous security guarantees for the computation process, the models trained by MPL are still vulnerable to attacks that rely solely on access to the models. Differential privacy can help defend against such attacks. However, the accuracy loss introduced by differential privacy and the huge communication overhead of secure multi-party computation protocols make it highly challenging to balance the three-way trade-off between privacy, efficiency, and accuracy. In this paper, we address this issue by proposing a solution, referred to as PEA (Private, Efficient, Accurate), which consists of a secure DPSGD protocol and two optimization methods. First, we propose a secure DPSGD protocol to enforce DPSGD in secret sharing-based MPL frameworks. Second, to reduce the accuracy loss caused by differential privacy noise and the huge communication overhead of MPL, we propose two optimization methods for the training process of MPL: (1) a data-independent feature extraction method, which simplifies the structure of the trained model; (2) a local data-based global model initialization method, which speeds up the convergence of model training. We implement PEA in two open-source MPL frameworks: TF-Encrypted and Queqiao. Experimental results on various datasets demonstrate the efficiency and effectiveness of PEA. For example, when ${\epsilon} = 2$, we can train a differentially private classification model with an accuracy of 88% for CIFAR-10 within 7 minutes under the LAN setting. This significantly outperforms CryptGPU, a state-of-the-art MPL framework, which takes more than 16 hours to train a non-private deep neural network model of the same accuracy on CIFAR-10.
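For readers unfamiliar with DPSGD, the sketch below shows the standard plaintext update that the secure protocol enforces: per-example gradients are clipped to a fixed L2 norm and calibrated Gaussian noise is added before the descent step. This is only an illustrative plaintext version under assumed parameter names (clip_norm, noise_multiplier); the paper's contribution is evaluating these operations under secret sharing, which is not shown here.

```python
import numpy as np

def dpsgd_step(params, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """One plaintext DPSGD step (illustrative sketch, not the secure protocol).

    params:             flat parameter vector, shape (num_params,)
    per_example_grads:  per-example gradients, shape (batch_size, num_params)
    """
    # Clip each example's gradient to L2 norm at most clip_norm.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / (norms + 1e-12))
    clipped = per_example_grads * scale

    # Sum clipped gradients and add Gaussian noise calibrated to the clipping bound.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=params.shape)
    noisy_grad = (clipped.sum(axis=0) + noise) / per_example_grads.shape[0]

    # Plain gradient descent update with the noisy, averaged gradient.
    return params - lr * noisy_grad
```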