Synthetic data are an attractive concept to enable privacy in data sharing. A fundamental question is how similar the privacy-preserving synthetic data are compared to the true data. Using metric privacy, an effective generalization of differential privacy beyond the discrete setting, we raise the problem of characterizing the optimal privacy-accuracy tradeoff by the metric geometry of the underlying space. We provide a partial solution to this problem in terms of the "entropic scale", a quantity that captures the multiscale geometry of a metric space via the behavior of its packing numbers. We illustrate the applicability of our privacy-accuracy tradeoff framework via a diverse set of examples of metric spaces.
翻译:暂无翻译