The wide deployment of machine learning in recent years has created a great demand for large-scale, high-dimensional data, raising serious privacy concerns. Differential privacy (DP) mechanisms are conventionally designed for scalar values rather than for structured data such as matrices. Our work proposes the Improved Matrix Gaussian Mechanism (IMGM) for matrix-valued DP, derived from a necessary and sufficient condition for $(\varepsilon,\delta)$-differential privacy. IMGM constrains only the singular values of the noise covariance matrices, which leaves considerable freedom in the noise design. Among the legitimate noise distributions for matrix-valued DP, we find that the optimal one is i.i.d. Gaussian noise, for which the DP constraint reduces to a lower bound on the noise scale of each element. We further derive a tight composition method for IMGM. Beyond the theoretical analysis, experiments on a variety of models and datasets verify that IMGM yields substantially higher utility than state-of-the-art mechanisms under the same privacy guarantee.
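To make the high-level description concrete, the sketch below shows the general shape of a matrix-valued Gaussian mechanism that adds i.i.d. noise to every entry. The noise calibration shown is the classical $(\varepsilon,\delta)$ Gaussian-mechanism bound, not the tighter IMGM bound derived in the paper, and the function name `gaussian_mechanism_matrix` is hypothetical; treat this as an illustrative assumption rather than the paper's method.

```python
import numpy as np

def gaussian_mechanism_matrix(query_output, l2_sensitivity, epsilon, delta, rng=None):
    """Release a matrix-valued query under (epsilon, delta)-DP by adding
    i.i.d. Gaussian noise to every entry.

    NOTE: the noise scale uses the classical Gaussian-mechanism calibration
    sigma >= l2_sensitivity * sqrt(2 * ln(1.25 / delta)) / epsilon; the
    per-element lower bound of IMGM may be tighter. This is a placeholder
    sketch, not the paper's mechanism.
    """
    rng = np.random.default_rng() if rng is None else rng
    sigma = l2_sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    noise = rng.normal(loc=0.0, scale=sigma, size=query_output.shape)
    return query_output + noise

# Example: privatize a 3x4 matrix-valued statistic with unit L2 sensitivity.
M = np.arange(12, dtype=float).reshape(3, 4)
M_private = gaussian_mechanism_matrix(M, l2_sensitivity=1.0, epsilon=1.0, delta=1e-5)
```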