We propose AIM, a novel algorithm for differentially private synthetic data generation. AIM is a workload-adaptive algorithm within the paradigm of algorithms that first select a set of queries, then privately measure those queries, and finally generate synthetic data from the noisy measurements. It uses a set of innovative features to iteratively select the most useful measurements, reflecting both their relevance to the workload and their value in approximating the input data. We also provide analytic expressions that bound per-query error with high probability, which can be used to construct confidence intervals and to inform users about the accuracy of the generated data. We show empirically that AIM consistently outperforms a wide range of existing mechanisms across a variety of experimental settings.
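To make the select and measure steps of this paradigm concrete, the following is a minimal sketch, not AIM itself. It assumes a toy setting in which candidate queries are marginals over small discrete attributes, selection uses the exponential mechanism with an L1-error score against a fixed per-cell guess, and measurement adds Gaussian noise calibrated to a zero-concentrated DP budget `rho`. All function names, the score, and the toy data are hypothetical choices for illustration; the generation step and AIM's actual selection criterion and error bounds are not reproduced here. The interval shown covers only the Gaussian noise on a single measured cell.

```python
import itertools
import math
import random
from collections import Counter


def marginal(records, attrs):
    """Count how often each value combination occurs on the given attributes."""
    return Counter(tuple(r[a] for a in attrs) for r in records)


def l1_error(records, attrs, domain, guess):
    """Hypothetical quality score: L1 distance between the true marginal
    and a fixed per-cell guess. One record changes it by at most 1."""
    true = marginal(records, attrs)
    cells = itertools.product(*(domain[a] for a in attrs))
    return float(sum(abs(true.get(c, 0) - guess) for c in cells))


def exponential_mechanism(scores, epsilon, sensitivity, rng):
    """Sample an index with probability proportional to
    exp(epsilon * score / (2 * sensitivity))."""
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(epsilon * (s - top) / (2.0 * sensitivity)) for s in scores]
    return rng.choices(range(len(scores)), weights=weights, k=1)[0]


def gaussian_sigma(rho, sensitivity=1.0):
    """Noise scale of the Gaussian mechanism under rho-zCDP."""
    return sensitivity * math.sqrt(1.0 / (2.0 * rho))


def measure(records, attrs, domain, sigma, rng):
    """Return the marginal over every cell of the domain, plus Gaussian noise."""
    true = marginal(records, attrs)
    cells = itertools.product(*(domain[a] for a in attrs))
    return {c: true.get(c, 0) + rng.gauss(0.0, sigma) for c in cells}


def confidence_interval(noisy_value, sigma, z=1.96):
    """Two-sided interval for one Gaussian-noised cell (z=1.96 gives about 95%)."""
    return (noisy_value - z * sigma, noisy_value + z * sigma)


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy data: three binary attributes, attributes 0 and 1 perfectly correlated.
    records = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)] * 2
    domain = {0: [0, 1], 1: [0, 1], 2: [0, 1]}
    candidates = [(0, 1), (0, 2), (1, 2)]

    # Assumption: a uniform guess of 2 records per cell, with the total
    # treated as public for this toy example.
    scores = [l1_error(records, c, domain, guess=2.0) for c in candidates]
    chosen = candidates[exponential_mechanism(scores, 1.0, 1.0, rng)]

    sigma = gaussian_sigma(rho=0.5)
    noisy = measure(records, chosen, domain, sigma, rng)
    for cell, value in sorted(noisy.items()):
        print(chosen, cell, round(value, 2), confidence_interval(value, sigma))
```

In this sketch the marginal over the two correlated attributes has the largest score, so the exponential mechanism favors it; a workload-adaptive method would additionally weight each candidate by its relevance to the queries the user cares about.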