The last few years have seen a surge of work on high-dimensional statistics under privacy constraints, mostly following two lines: the ``worst case'' line, which makes no distributional assumptions on the input data; and the ``strong assumptions'' line, which assumes that the data is generated from specific families of distributions, e.g., subgaussian distributions. In this work we take a middle ground, obtaining new differentially private algorithms with polynomial sample complexity for estimating quantiles in high dimensions, as well as for estimating and sampling points of high Tukey depth, all working under very mild distributional assumptions. From the technical perspective, our work relies upon deep robustness results in the convex geometry literature, demonstrating how such results can be used in a private context. Our main object of interest is the (convex) floating body (FB), a notion going back to Archimedes, which is a robust and well-studied high-dimensional analogue of the interquantile range. We show how one can privately, and with polynomially many samples, (a) output an approximate interior point of the FB -- e.g., ``a typical user'' in a high-dimensional database -- by leveraging the robustness of the Steiner point of the FB; and, at the expense of polynomially many more samples, (b) produce an approximate uniform sample from the FB, by constructing a private noisy projection oracle.
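To make the notion of Tukey depth concrete, the following is a minimal, non-private sketch, not the algorithm of this work. The empirical Tukey depth of a point x is the smallest fraction of data points lying in any closed halfspace that contains x. The function name `approx_tukey_depth` and its parameters are hypothetical, chosen for illustration. The sketch assumes that minimizing over finitely many random directions is acceptable, which yields only an upper bound on the true empirical depth, and it provides no differential privacy guarantee.

```python
import numpy as np


def approx_tukey_depth(x, points, n_directions=2000, seed=0):
    """Upper bound on the empirical Tukey depth of x with respect to points.

    Illustrative and NON-private (an assumption of this sketch): the exact
    depth minimizes over all unit directions, whereas only n_directions
    random directions are checked here, so the result can overestimate it.
    """
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    x = np.asarray(x, dtype=float)
    d = points.shape[1]

    # Random unit directions: normalized standard Gaussian vectors.
    dirs = rng.standard_normal((n_directions, d))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)

    proj_points = points @ dirs.T   # shape (n_points, n_directions)
    proj_x = dirs @ x               # shape (n_directions,)

    # For each direction u, the fraction of data in {p : <p, u> >= <x, u>}.
    frac = (proj_points >= proj_x).mean(axis=0)
    return float(frac.min())


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    data = rng.standard_normal((500, 3))
    print("depth at origin:", approx_tukey_depth(np.zeros(3), data))
    print("depth far away: ", approx_tukey_depth(np.array([100.0, 0.0, 0.0]), data))
```

Points near the center of the data receive depth close to 1/2, while points outside the convex hull of the data receive depth 0. The floating body at level δ can be read, informally, as the set of points whose depth with respect to the underlying distribution is at least δ.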