In this paper we revisit the problems of differentially private empirical risk minimization (DP-ERM) and differentially private stochastic convex optimization (DP-SCO). We show that a well-studied continuous-time algorithm from statistical physics, called Langevin diffusion (LD), simultaneously provides optimal privacy/utility trade-offs for both DP-ERM and DP-SCO, under both $\epsilon$-DP and $(\epsilon,\delta)$-DP, for convex as well as strongly convex loss functions. We provide new time- and dimension-independent uniform stability properties of LD, using which we provide the corresponding optimal excess population risk guarantees for $\epsilon$-DP. An important attribute of our DP-SCO guarantees for $\epsilon$-DP is that they match the non-private optimal bounds as $\epsilon\to\infty$. Along the way, we provide various technical tools that may be of independent interest: i) a new R\'enyi divergence bound for LD when run on loss functions over two neighboring data sets, ii) excess empirical risk bounds for last-iterate LD, analogous to those of Shamir and Zhang for noisy stochastic gradient descent (SGD), and iii) a two-phase excess risk analysis of LD, where in the first phase the diffusion has not converged in any reasonable sense to a stationary distribution, and in the second phase the diffusion has converged to a variant of the Gibbs distribution. Our universality results crucially rely on the dynamics of LD. When it has converged to a stationary distribution, we obtain the optimal bounds under $\epsilon$-DP. When it is run only for a very short time $\propto 1/p$, we obtain the optimal bounds under $(\epsilon,\delta)$-DP. Here, $p$ is the dimensionality of the model space.
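To make the two regimes of LD concrete, the following is a minimal illustrative sketch, not the algorithm analyzed in the paper. It assumes the common parametrization $d\theta_t = -\nabla L(\theta_t)\,dt + \sqrt{2/\beta}\,dW_t$ with an inverse temperature $\beta$, and it replaces the continuous-time process with an Euler-Maruyama discretization. The function names, the regularized least-squares loss, the step size, and the value of `beta` are all hypothetical choices made for illustration; in particular, `beta` is not calibrated to any privacy level, and no projection onto a constraint set is performed.

```python
import numpy as np

def langevin_diffusion_sketch(grad, theta0, beta, total_time, step, rng):
    """Euler-Maruyama approximation of d theta = -grad(theta) dt + sqrt(2/beta) dW.

    Assumption: this discretization and parametrization are illustrative only;
    the paper studies the continuous-time process.
    """
    theta = np.array(theta0, dtype=float)
    n_steps = int(round(total_time / step))
    noise_scale = np.sqrt(2.0 * step / beta)  # std of the Brownian increment per step
    for _ in range(n_steps):
        noise = rng.standard_normal(theta.shape)
        theta = theta - step * grad(theta) + noise_scale * noise
    return theta

def make_quadratic_grad(X, y, reg):
    """Gradient of the (hypothetical) loss 0.5*mean((X theta - y)^2) + 0.5*reg*||theta||^2."""
    n = X.shape[0]
    def grad(theta):
        return X.T @ (X @ theta - y) / n + reg * theta
    return grad

rng = np.random.default_rng(0)
p = 5                                   # dimensionality of the model space
X = rng.standard_normal((200, p))
y = X @ np.ones(p)
grad = make_quadratic_grad(X, y, reg=0.1)

# Long run: the iterate is approximately a sample from the stationary (Gibbs-like) law.
theta_long = langevin_diffusion_sketch(grad, np.zeros(p), beta=1e4,
                                       total_time=50.0, step=0.01, rng=rng)
# Short run: total time proportional to 1/p, far from stationarity.
theta_short = langevin_diffusion_sketch(grad, np.zeros(p), beta=1e4,
                                        total_time=1.0 / p, step=0.01, rng=rng)
```

Under these assumptions, `theta_long` concentrates near the minimizer of the regularized loss, while `theta_short` has moved only part of the way from its initialization, mirroring the converged and short-time regimes described above.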