In this paper, we study differentially private empirical risk minimization (DP-ERM). It has been shown that the worst-case utility of DP-ERM degrades polynomially as the dimension increases. This is a major obstacle to privately learning large machine learning models. In high dimension, it is common for some of the model's parameters to carry more information than others. To exploit this, we propose a differentially private greedy coordinate descent (DP-GCD) algorithm. At each iteration, DP-GCD privately performs a coordinate-wise gradient step along the gradient's (approximately) greatest entry. We show theoretically that DP-GCD can achieve a logarithmic dependence on the dimension for a wide range of problems by naturally exploiting their structural properties (such as quasi-sparse solutions). We illustrate this behavior numerically, both on synthetic and real datasets.
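The DP-GCD iteration described above can be sketched as follows. This is an illustrative toy implementation, not the paper's exact algorithm: the function names, the use of the report-noisy-max mechanism for coordinate selection, the Laplace noise scales, and the step size are all assumptions made for the sake of the example; in practice the noise must be calibrated to the per-coordinate gradient sensitivity and the privacy budget.

```python
import numpy as np

def dp_gcd_step(w, grad_fn, lam_select, lam_update, gamma, rng):
    """One illustrative DP-GCD iteration (hypothetical sketch).

    lam_select, lam_update: Laplace noise scales (assumed here; in a real
    implementation they are calibrated to the coordinate-wise gradient
    sensitivity and the privacy parameters).
    gamma: step size.
    """
    g = grad_fn(w)
    # Report-noisy-max: privately select the coordinate whose gradient
    # entry is (approximately) largest in magnitude.
    j = int(np.argmax(np.abs(g) + rng.laplace(0.0, lam_select, size=g.shape)))
    # Noisy coordinate-wise gradient step on the selected entry only.
    w = w.copy()
    w[j] -= gamma * (g[j] + rng.laplace(0.0, lam_update))
    return w

# Toy usage: a quadratic objective whose minimizer is quasi-sparse,
# so greedy selection concentrates updates on the dominant coordinate.
rng = np.random.default_rng(0)
A = np.diag([10.0, 1.0, 0.1, 0.01])
b = np.array([10.0, 0.0, 0.0, 0.0])
grad = lambda w: A @ w - b  # gradient of 0.5 * w^T A w - b^T w
w = np.zeros(4)
for _ in range(50):
    w = dp_gcd_step(w, grad, lam_select=0.01, lam_update=0.01,
                    gamma=0.05, rng=rng)
```

On this toy problem the minimizer is (1, 0, 0, 0); because only the first coordinate carries a large gradient, DP-GCD spends its updates there, illustrating how quasi-sparse structure can limit the effective dimension the noise must cover.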