What is the information leakage of an iterative learning algorithm about its training data, when the internal state of the algorithm is \emph{not} observable? How much does each specific training epoch contribute to the final leakage? We study this problem for noisy gradient descent algorithms, and model the \emph{dynamics} of R\'enyi differential privacy loss throughout the training process. Our analysis traces a provably tight bound on the R\'enyi divergence between the pair of probability distributions over parameters of models trained on neighboring datasets. We prove that the privacy loss converges exponentially fast for smooth and strongly convex loss functions, which is a significant improvement over composition theorems. For Lipschitz, smooth, and strongly convex loss functions, we prove optimal utility for differentially private algorithms with a small gradient complexity.
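As a minimal sketch of the setting (the symbols $\eta$, $\sigma$, $\ell$, $n$, and $\alpha$ are assumed notation for illustration, not taken from this abstract), noisy gradient descent on a dataset $D$ updates
\[
\theta_{t+1} \;=\; \theta_t \;-\; \eta\,\nabla \mathcal{L}_D(\theta_t) \;+\; \mathcal{N}\!\left(0, \sigma^2 \mathbf{I}\right),
\qquad
\mathcal{L}_D(\theta) \;=\; \frac{1}{n}\sum_{z \in D} \ell(\theta; z),
\]
and the quantity tracked throughout training is the R\'enyi divergence of order $\alpha > 1$ between the distributions of the parameters $\theta_t$ induced by neighboring datasets $D$ and $D'$,
\[
R_\alpha\!\left(P_{\theta_t \mid D} \,\middle\|\, P_{\theta_t \mid D'}\right)
\;=\; \frac{1}{\alpha - 1}\,
\log \mathbb{E}_{x \sim P_{\theta_t \mid D'}}\!\left[\left(\frac{p_{\theta_t \mid D}(x)}{p_{\theta_t \mid D'}(x)}\right)^{\alpha}\right].
\]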