We study adaptive methods for differentially private convex optimization, proposing and analyzing differentially private variants of a Stochastic Gradient Descent (SGD) algorithm with adaptive stepsizes, as well as the AdaGrad algorithm. We provide upper bounds on the regret of both algorithms and show that the bounds are (worst-case) optimal. As a consequence of our development, we show that our private versions of AdaGrad outperform adaptive SGD, which in turn outperforms traditional SGD in scenarios with non-isotropic gradients where (non-private) AdaGrad provably outperforms SGD. The major challenge is that the isotropic noise typically added for privacy dominates the signal in gradient geometry for high-dimensional problems; prior approaches that effectively optimize over lower-dimensional subspaces simply ignore the actual problems that varying gradient geometries introduce. In contrast, we study non-isotropic clipping and noise addition, developing a principled theoretical approach; the consequent procedures also enjoy significantly stronger empirical performance than prior approaches.
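As a rough illustration of the kind of update such methods use, the following is a minimal numpy sketch of an AdaGrad-style step with per-coordinate (non-isotropic) clipping and Gaussian noise; the function name and the parameters `c`, `sigma`, and `eta`, as well as the coordinate-wise clipping rule, are illustrative assumptions rather than the paper's exact procedure.

```python
# Hypothetical sketch of a privatized AdaGrad-style step with
# non-isotropic (per-coordinate) clipping and noise; not the paper's
# exact algorithm, and privacy calibration of sigma is omitted.
import numpy as np

def private_adagrad_step(w, grads, c, sigma, eta, state, eps=1e-8):
    """One step on a batch of per-example gradients `grads` (shape [n, d]).

    c     : per-coordinate clipping bounds, shape (d,)
    sigma : per-coordinate Gaussian noise scales, shape (d,)
    eta   : base stepsize
    state : running sum of squared (privatized) gradients, shape (d,)
    """
    clipped = np.clip(grads, -c, c)                        # non-isotropic clipping
    noisy_sum = clipped.sum(axis=0) + np.random.normal(0.0, sigma, size=w.shape)
    g = noisy_sum / grads.shape[0]                         # privatized average gradient
    state = state + g ** 2                                 # AdaGrad accumulator
    w = w - eta * g / (np.sqrt(state) + eps)               # preconditioned update
    return w, state
```

Here the clipping bounds and noise scales vary across coordinates, reflecting the non-isotropic clipping and noise addition discussed above, in contrast to the usual isotropic Gaussian mechanism.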