The test statistics of many nonparametric hypothesis tests, including Kolmogorov-Smirnov, Kuiper, Cram\'er-von Mises, and Wasserstein, can be expressed as a pseudo-metric applied to the empirical cumulative distribution function (ecdf). These statistics can be used to test goodness-of-fit, to compare two samples, to analyze paired data, or to test symmetry. To design differentially private (DP) versions of these tests, we show that test statistics of this form have low sensitivity, so only a small amount of noise is needed to achieve DP. The tests are also distribution-free, which enables accurate $p$-value calculations via Monte Carlo approximation. We show that in several settings, especially with small privacy budgets or heavy-tailed data, our new DP tests outperform alternative nonparametric DP tests.
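To make the two properties above concrete (low sensitivity and distribution-freeness), here is a minimal sketch of a DP one-sample Kolmogorov-Smirnov goodness-of-fit test. It is an illustration under stated assumptions, not necessarily the construction used in the paper: it assumes a continuous null distribution, assumes neighboring datasets differ by replacing one record (so the ecdf moves by at most $1/n$ in sup norm and the KS statistic has sensitivity $1/n$), and uses the Laplace mechanism as the noise-adding step. The function names `ks_statistic` and `dp_ks_test` are hypothetical.

```python
import numpy as np


def ks_statistic(x, cdf):
    """One-sample KS statistic sup_t |F_n(t) - F_0(t)| for a continuous F_0."""
    u = np.sort(cdf(np.asarray(x, dtype=float)))
    n = u.size
    i = np.arange(1, n + 1)
    # The supremum is attained just before or at one of the order statistics.
    return max(np.max(i / n - u), np.max(u - (i - 1) / n))


def dp_ks_test(x, cdf, epsilon, n_sim=2000, seed=None):
    """Return (noisy statistic, Monte Carlo p-value) for H0: X ~ F_0.

    Assumption: replacing one record changes the statistic by at most 1/n,
    so Laplace noise with scale 1/(n * epsilon) gives epsilon-DP.
    """
    rng = np.random.default_rng(seed)
    n = len(x)
    scale = 1.0 / (n * epsilon)
    noisy_stat = ks_statistic(x, cdf) + rng.laplace(0.0, scale)

    # Under H0, F_0(X) is Uniform(0, 1), so the null distribution of the
    # noisy statistic does not depend on F_0 and can be simulated directly.
    def identity(u):
        return u

    null = np.array([
        ks_statistic(rng.uniform(size=n), identity) + rng.laplace(0.0, scale)
        for _ in range(n_sim)
    ])
    p_value = (1 + np.sum(null >= noisy_stat)) / (n_sim + 1)
    return noisy_stat, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample = rng.uniform(0.9, 1.0, size=200)  # clearly not Uniform(0, 1)
    stat, p = dp_ks_test(sample, lambda t: np.clip(t, 0.0, 1.0), epsilon=1.0, seed=1)
    print(f"noisy KS statistic = {stat:.3f}, p-value = {p:.4f}")
```

Because the simulated null statistics receive the same Laplace noise as the observed one, the Monte Carlo $p$-value accounts for the privacy noise as well as the sampling variability. The other ecdf-based statistics mentioned above would need their own sensitivity bounds, which this sketch does not cover.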