We revisit the task of computing the span of the top $r$ singular vectors $u_1, \ldots, u_r$ of a matrix under differential privacy. We show that a simple and efficient algorithm -- based on singular value decomposition and standard perturbation mechanisms -- returns a private rank-$r$ approximation whose error depends only on the \emph{rank-$r$ coherence} of $u_1, \ldots, u_r$ and the spectral gap $\sigma_r - \sigma_{r+1}$. This resolves a question posed by Hardt and Roth~\cite{hardt2013beyond}. Our estimator outperforms the state of the art -- significantly so in some regimes. In particular, we show that in the dense setting, it achieves the same guarantees for single-spike PCA in the Wishart model as those attained by optimal non-private algorithms, whereas prior private algorithms failed to do so. In addition, we prove that (rank-$r$) coherence does not increase under Gaussian perturbations. This implies that any estimator based on the Gaussian mechanism -- including ours -- preserves the coherence of the input. We conjecture that similar behavior holds for other structured models, including planted problems in graphs. We also explore applications of coherence to graph problems. In particular, we present a differentially private algorithm for Max-Cut and other constraint satisfaction problems under low coherence assumptions.
翻译:我们重新审视在差分隐私约束下计算矩阵前 $r$ 个奇异向量 $u_1, \\ldots, u_r$ 张成空间的任务。我们证明,一种基于奇异值分解和标准扰动机制的简单高效算法能够返回一个私有秩-$r$ 近似,其误差仅取决于 $u_1, \\ldots, u_r$ 的\\emph{秩-$r$ 相干性}以及谱间隙 $\\sigma_r - \\sigma_{r+1}$。这解决了 Hardt 和 Roth~\\cite{hardt2013beyond} 提出的问题。我们的估计器优于现有技术——在某些机制中显著更优。具体而言,我们表明在稠密设置下,该算法在 Wishart 模型中为单峰主成分分析实现了与最优非私有算法相同的保证,而先前的私有算法未能做到这一点。此外,我们证明(秩-$r$)相干性在高斯扰动下不会增加。这意味着任何基于高斯机制的估计器——包括我们的方法——都能保持输入的相干性。我们推测类似行为也适用于其他结构化模型,包括图中的植入问题。我们还探讨了相干性在图问题中的应用。特别地,我们提出了一种在低相干性假设下用于最大割问题及其他约束满足问题的差分隐私算法。