Private data analysis suffers a costly curse of dimensionality. However, the data often has an underlying low-dimensional structure. For example, when optimizing via gradient descent, the gradients often lie in or near a low-dimensional subspace. If that low-dimensional structure can be identified, then we can avoid paying (in terms of privacy or accuracy) for the high ambient dimension. We present differentially private algorithms that take input data sampled from a low-dimensional linear subspace (possibly with a small amount of error) and output that subspace (or an approximation to it). These algorithms can serve as a pre-processing step for other procedures.
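To make the setting concrete, the following is a minimal, non-private sketch of the problem the abstract describes: points in R^d lying near a k-dimensional linear subspace, with the subspace recovered via a top-k SVD. The helper names (sample_near_subspace, estimate_subspace) and all parameter choices are illustrative assumptions, not part of the paper; the paper's differentially private algorithms solve this recovery step under a privacy constraint, by mechanisms not specified in this abstract.

    # Non-private sketch of the subspace-recovery setting (NOT the paper's private algorithm).
    import numpy as np

    def sample_near_subspace(n, d, k, noise=0.01, rng=None):
        """Generate n points in R^d lying near a random k-dimensional linear subspace."""
        rng = np.random.default_rng(rng)
        basis, _ = np.linalg.qr(rng.standard_normal((d, k)))  # orthonormal basis of the subspace
        coords = rng.standard_normal((n, k))                   # coordinates within the subspace
        return coords @ basis.T + noise * rng.standard_normal((n, d)), basis

    def estimate_subspace(X, k):
        """Non-private baseline: top-k right singular vectors of the data matrix."""
        _, _, vt = np.linalg.svd(X, full_matrices=False)
        return vt[:k].T  # d x k orthonormal basis

    X, true_basis = sample_near_subspace(n=1000, d=100, k=5)
    est_basis = estimate_subspace(X, k=5)
    # Cosines of the principal angles between the true and estimated subspaces;
    # values near 1 mean the subspaces agree.
    cosines = np.linalg.svd(true_basis.T @ est_basis, compute_uv=False)
    print("smallest principal-angle cosine:", cosines.min())

A private analysis could then run downstream procedures (e.g., gradient-based optimization) in the recovered k-dimensional coordinates rather than the d-dimensional ambient space, which is the pre-processing use the abstract points to.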