We introduce a new differentially private regression setting we call Private Regression in Multiple Outcomes (PRIMO), inspired by the common situation where a data analyst wants to perform a set of $l$ regressions while preserving privacy, where the covariates $X$ are shared across all $l$ regressions, and each regression $i \in [l]$ has a different vector of outcomes $y_i$. While naively applying private linear regression techniques $l$ times leads to a $\sqrt{l}$ multiplicative increase in error over the standard linear regression setting, in Subsection $4.1$ we modify techniques based on sufficient statistics perturbation (SSP) to yield greatly improved dependence on $l$. In Subsection $4.2$ we prove an equivalence to the problem of privately releasing the answers to a special class of low-sensitivity queries we call inner product queries. Via this equivalence, we adapt the geometric projection-based methods from prior work on private query release to the PRIMO setting. Under the assumption that the labels $Y$ are public, the projection gives improved results over the Gaussian mechanism when $n < l\sqrt{d}$, with no asymptotic dependence on $l$ in the error. In Subsection $4.3$ we study the complexity of our projection algorithm, and we analyze a faster sub-sampling-based variant in Subsection $4.4$. Finally, in Section $5$ we apply our algorithms to the task of private genomic risk prediction for multiple phenotypes using data from the 1000 Genomes project. We find that for moderately large values of $l$ our techniques drastically improve the accuracy relative to both the naive baseline that uses existing private regression methods and our modified SSP algorithm that does not use the projection.
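To make the setting concrete, the following is a minimal sketch of generic sufficient statistics perturbation applied to $l$ outcomes at once. It is not the modified SSP algorithm of Subsection $4.1$, whose details are not given here. The function names (`gaussian_sigma`, `ssp_multi_outcome`), the even split of the privacy budget, the ridge term, and the assumptions that every row of $X$ has Euclidean norm at most 1 and every entry of $Y$ lies in $[-1, 1]$ are all illustrative choices of ours. The point it illustrates is that the noisy covariance $X^\top X$ is computed once and shared across all $l$ regressions, while only the cross term $X^\top Y$ grows with $l$.

```python
import numpy as np


def gaussian_sigma(sensitivity, eps, delta):
    # Classical Gaussian mechanism calibration (its guarantee assumes eps < 1).
    return sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps


def ssp_multi_outcome(X, Y, eps, delta, ridge=1.0, rng=None):
    """Illustrative SSP for l regressions that share the covariates X.

    Assumptions (ours, not from the paper): each row of X has L2 norm <= 1,
    each entry of Y lies in [-1, 1], and the (eps, delta) budget is split
    evenly between the two sufficient statistics.
    X has shape (n, d), Y has shape (n, l); returns a (d, l) matrix.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = X.shape[1]
    l = Y.shape[1]
    eps_half, delta_half = eps / 2.0, delta / 2.0

    # One row changes X^T X by x x^T, whose Frobenius norm is ||x||^2 <= 1.
    sigma_xx = gaussian_sigma(1.0, eps_half, delta_half)
    noise = rng.normal(0.0, sigma_xx, size=(d, d))
    noise = np.triu(noise) + np.triu(noise, 1).T  # keep the statistic symmetric
    noisy_xx = X.T @ X + noise

    # One row changes X^T Y by x y^T, with Frobenius norm <= sqrt(l).
    sigma_xy = gaussian_sigma(np.sqrt(l), eps_half, delta_half)
    noisy_xy = X.T @ Y + rng.normal(0.0, sigma_xy, size=(d, l))

    # Solve all l regressions against the same noisy covariance;
    # the ridge term stabilizes the solve against the added noise.
    return np.linalg.solve(noisy_xx + ridge * np.eye(d), noisy_xy)
```

In this sketch the noise added to $X^\top Y$ has per-entry standard deviation proportional to $\sqrt{l}$, which is the kind of dependence on $l$ that the techniques summarized above aim to improve.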