Machine learning models can leak information about the data used to train them. Differentially Private (DP) variants of optimization algorithms like Stochastic Gradient Descent (DP-SGD) have been designed to mitigate this, inducing a trade-off between privacy and utility. In this paper, we propose a new method for composite Differentially Private Empirical Risk Minimization (DP-ERM): Differentially Private proximal Coordinate Descent (DP-CD). We analyze its utility through a novel theoretical analysis of inexact coordinate descent, and highlight some regimes where DP-CD outperforms DP-SGD, thanks to the possibility of using larger step sizes. We also prove new lower bounds for composite DP-ERM under coordinate-wise regularity assumptions, that are, in some settings, nearly matched by our algorithm. In practical implementations, the coordinate-wise nature of DP-CD updates demands special care in choosing the clipping thresholds used to bound individual contributions to the gradients. A natural parameterization of these thresholds emerges from our theory, limiting the addition of unnecessarily large noise without requiring coordinate-wise hyperparameter tuning or extra computational cost.
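To make the coordinate-wise clipping and noise addition concrete, below is a minimal sketch of differentially private proximal coordinate descent on a Lasso-type objective. This is an illustrative assumption, not the paper's exact DP-CD algorithm or its noise calibration: the function dp_proximal_cd, the per-coordinate arrays clip, sigma, and gamma, and the use of Gaussian noise are all placeholders chosen for readability.

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of t * |x| (soft-thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def dp_proximal_cd(X, y, lam, clip, sigma, gamma, n_passes=10, rng=None):
    """Toy DP proximal coordinate descent on 0.5/n * ||Xw - y||^2 + lam * ||w||_1.

    clip, sigma, gamma are per-coordinate arrays: clipping thresholds on
    per-example gradient contributions, Gaussian noise standard deviations,
    and step sizes. Values here are assumed given, not privacy-calibrated.
    """
    rng = np.random.default_rng(rng)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_passes):
        for j in rng.permutation(d):
            residual = X @ w - y
            # Per-example contributions to the j-th partial derivative.
            per_example = X[:, j] * residual
            # Clip each example's contribution to bound its sensitivity.
            clipped = np.clip(per_example, -clip[j], clip[j])
            grad_j = clipped.mean()
            # Add noise scaled to the clipping threshold of coordinate j.
            noisy_grad_j = grad_j + rng.normal(0.0, sigma[j])
            # Proximal (soft-thresholding) coordinate update.
            w[j] = soft_threshold(w[j] - gamma[j] * noisy_grad_j, gamma[j] * lam)
    return w
```

The sketch highlights the point made above: each coordinate gets its own clipping threshold, noise scale, and step size, so the noise added to coordinate j can be matched to the (coordinate-wise) regularity of the objective rather than to a single global bound.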