We address the problem of learning a machine learning model from training data that originates at multiple data owners while providing formal privacy guarantees regarding the protection of each owner's data. Existing solutions based on Differential Privacy (DP) achieve this at the cost of a drop in accuracy. Solutions based on Secure Multiparty Computation (MPC) do not incur such accuracy loss but leak information when the trained model is made publicly available. We propose an MPC solution for training DP models. Our solution relies on an MPC protocol for model training, and an MPC protocol for perturbing the trained model coefficients with Laplace noise in a privacy-preserving manner. The resulting MPC+DP approach achieves higher accuracy than a pure DP approach while providing the same formal privacy guarantees. Our work obtained first place in the iDASH2021 Track III competition on confidential computing for secure genome analysis.
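The noise step mentioned above is the standard Laplace mechanism applied to the trained coefficients (output perturbation). As a rough illustration only, the plaintext sketch below shows what that perturbation computes; in the solution described here it is instead carried out inside an MPC protocol on secret-shared values, and the function name, sensitivity value, and epsilon below are illustrative assumptions rather than the paper's parameters.

```python
import numpy as np

def perturb_coefficients(theta: np.ndarray, sensitivity: float, epsilon: float) -> np.ndarray:
    """Add Laplace noise with scale sensitivity/epsilon to each model coefficient."""
    scale = sensitivity / epsilon  # Laplace mechanism noise scale
    noise = np.random.laplace(loc=0.0, scale=scale, size=theta.shape)
    return theta + noise

# Hypothetical example: epsilon = 1.0 and an assumed L1-sensitivity of 0.05
theta = np.array([0.8, -1.2, 0.3])          # coefficients after (MPC) training
private_theta = perturb_coefficients(theta, sensitivity=0.05, epsilon=1.0)
```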