What is the information leakage of an iterative randomized learning algorithm about its training data, when the internal state of the algorithm is \emph{private}? How much does each specific training epoch contribute to the information leakage through the released model? We study this problem for noisy gradient descent algorithms, and model the \emph{dynamics} of R\'enyi differential privacy loss throughout the training process. Our analysis traces a provably \emph{tight} bound on the R\'enyi divergence between the pair of probability distributions over the parameters of models trained on neighboring datasets. We prove that, for smooth and strongly convex loss functions, the privacy loss converges exponentially fast; this is a significant improvement over composition theorems, which over-estimate the privacy loss by upper-bounding its total value over all intermediate gradient computations. For Lipschitz, smooth, and strongly convex loss functions, we prove that noisy gradient descent achieves optimal utility with a small gradient complexity.
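As an illustrative sketch (the notation below is a standard convention, not the paper's exact statement), the noisy gradient descent update and the R\'enyi divergence that quantifies the privacy loss at step $t$ can be written as
\[
\theta_{t+1} \;=\; \theta_t \;-\; \eta\,\nabla \mathcal{L}(\theta_t; D) \;+\; \sqrt{2\eta\sigma^2}\, Z_t,
\qquad Z_t \sim \mathcal{N}(0, \mathbb{I}_d),
\]
\[
\varepsilon_\alpha(t) \;=\; D_\alpha\!\left(\mu_t \,\middle\|\, \mu'_t\right)
\;=\; \frac{1}{\alpha-1}\,
\log \int \mu_t(\theta)^{\alpha}\, \mu'_t(\theta)^{1-\alpha}\, d\theta,
\]
where $\mu_t$ and $\mu'_t$ denote the distributions over model parameters at step $t$ when training on neighboring datasets $D$ and $D'$; the step size $\eta$, noise variance $\sigma^2$, and the scaling $\sqrt{2\eta\sigma^2}$ are assumed conventions used here only for illustration.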