Differential privacy is the de facto standard for protecting privacy in a variety of applications. One of the key challenges is private data release, which is particularly relevant in scenarios where limited information about the desired statistics is available beforehand. Recent work has presented a differentially private data release algorithm that achieves optimal rates of order $n^{-1/d}$, with $n$ being the size of the dataset and $d$ being the dimension, for the worst-case error over all Lipschitz continuous statistics. This type of guarantee is desirable in many practical applications, as for instance it ensures that clusters present in the data are preserved. However, due to the "slow" rates, it is often infeasible in practice unless the dimension of the data is small. We demonstrate that these rates can be significantly improved to $n^{-1/s}$ when only guarantees over s-sparse Lipschitz continuous functions are required, or to $n^{-1/(s+1)}$ when the data lies on an unknown s-dimensional subspace, disregarding logarithmic factors. We therefore obtain practically meaningful rates for moderate constants $s$ which motivates future work on computationally efficient approximate algorithms for this~problem.
翻译:不同隐私是各种应用中保护隐私的实际标准。关键的挑战之一是私人数据发布,这在事先获得关于所需统计数据的信息有限的情况下特别相关。最近的工作提出了一种差别化的私人数据发布算法,这种算法达到最佳的排序 $n ⁇ -1/d}美元,其中美元为数据集的大小,美元为维度,在所有Lipschitz连续统计中最差的错误中,美元为维度最差的。这种保证在许多实际应用中是可取的,例如,它确保数据中的组群得到保存。然而,由于“低”率,除非数据规模小,否则这种数据在实际中往往不可行。我们证明,只要需要保证S-sparch Lipschitz连续功能,或当数据位于未知的平面次空间,而无视对数因素,这些数据就等于$-1/(s+1)}美元。因此,我们为中值恒值获得实际有意义的率,美元,这能激励未来高效的算算算法近。