In this paper, we study differentially private empirical risk minimization (DP-ERM). It has been shown that the (worst-case) utility of DP-ERM degrades as the dimension increases. This is a major obstacle to privately learning large machine learning models. In high dimension, it is common for some of the model's parameters to carry more information than others. To exploit this, we propose a differentially private greedy coordinate descent (DP-GCD) algorithm. At each iteration, DP-GCD privately performs a coordinate-wise gradient step along the direction of the gradient's (approximately) greatest entry. We show theoretically that DP-GCD can improve utility by exploiting structural properties of the problem's solution (such as sparsity or quasi-sparsity), making very fast progress in early iterations. We then illustrate this numerically, on both synthetic and real datasets. Finally, we describe promising directions for future work.
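To make the per-iteration mechanism concrete, the sketch below illustrates one way a greedy coordinate step with private selection can be implemented: the coordinate with the (approximately) largest gradient entry is chosen via a report-noisy-max style mechanism, and the update along that coordinate is also perturbed. The least-squares objective, the budget split, the noise scales, and the step size are illustrative assumptions and do not reproduce the paper's exact DP-GCD specification or its privacy accounting.

```python
import numpy as np

def dp_gcd_sketch(X, y, epsilon, n_iters=50, step_size=0.1, seed=None):
    """Illustrative greedy coordinate descent with private coordinate
    selection and a noisy coordinate-wise gradient step (assumptions:
    least-squares loss, Laplace noise, naive per-iteration budget split)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    # Split the privacy budget evenly across iterations, then between
    # selection and update (a simplifying assumption, not the paper's accounting).
    eps_iter = epsilon / n_iters
    scale_select = 2.0 / (n * eps_iter / 2)   # Laplace scale for coordinate selection
    scale_update = 2.0 / (n * eps_iter / 2)   # Laplace scale for the gradient entry

    for _ in range(n_iters):
        grad = X.T @ (X @ w - y) / n          # gradient of the empirical risk
        # Report-noisy-max: privately pick the (approximately) greatest gradient entry.
        j = int(np.argmax(np.abs(grad) + rng.laplace(scale=scale_select, size=d)))
        # Noisy coordinate-wise gradient step on the selected coordinate.
        noisy_gj = grad[j] + rng.laplace(scale=scale_update)
        w[j] -= step_size * noisy_gj
    return w
```

The design intuition is that when only a few coordinates carry most of the information (a sparse or quasi-sparse solution), updating one well-chosen coordinate per iteration spends the privacy budget where it matters most, which is what drives the fast early progress described above.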