We present a highly effective algorithmic approach for generating $\varepsilon$-differentially private synthetic data in a bounded metric space, with near-optimal utility guarantees under the 1-Wasserstein distance. In particular, for a dataset $X$ of $n$ points in the hypercube $[0,1]^d$, our algorithm generates a synthetic dataset $Y$ such that the expected 1-Wasserstein distance between the empirical measures of $X$ and $Y$ is $O((\varepsilon n)^{-1/d})$ for $d\geq 2$ and $O(\log^2(\varepsilon n)(\varepsilon n)^{-1})$ for $d=1$. This accuracy guarantee is optimal up to a constant factor for $d\geq 2$ and up to a logarithmic factor for $d=1$. Our algorithm has a fast running time of $O(\varepsilon n)$ for all $d\geq 1$ and demonstrates improved accuracy compared to the method in (Boedihardjo et al., 2022) for $d\geq 2$.
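
For reference, the quantities in this guarantee can be written out with the standard definitions. This is a sketch only: the symbols $\mu_X$, $\mu_Y$, $m$ (the size of $Y$), $\mathcal{M}$ (the randomized algorithm), and $C$ are notation assumed here rather than taken from the text above, and the choice of metric on $[0,1]^d$ is left unspecified. A randomized algorithm $\mathcal{M}$ is $\varepsilon$-differentially private if, for all datasets $X, X'$ differing in a single point and all measurable sets $S$ of outputs,

$$\Pr[\mathcal{M}(X)\in S] \le e^{\varepsilon}\,\Pr[\mathcal{M}(X')\in S].$$

With empirical measures $\mu_X = \frac{1}{n}\sum_{i=1}^{n}\delta_{X_i}$ and $\mu_Y = \frac{1}{m}\sum_{j=1}^{m}\delta_{Y_j}$, the 1-Wasserstein distance has the Kantorovich-Rubinstein dual form

$$W_1(\mu_X,\mu_Y) = \sup_{\mathrm{Lip}(f)\le 1}\left(\int f\,d\mu_X - \int f\,d\mu_Y\right),$$

and the stated utility bounds read

$$\mathbb{E}\,W_1(\mu_X,\mu_Y) \le \begin{cases} C\,(\varepsilon n)^{-1/d}, & d\ge 2,\\ C\,\log^2(\varepsilon n)\,(\varepsilon n)^{-1}, & d=1,\end{cases}$$

where the expectation is over the randomness of the algorithm and $C$ stands for the constant hidden by the $O(\cdot)$ notation, which may depend on $d$.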