$f$-divergences, which quantify the discrepancy between probability distributions, are ubiquitous in information theory, machine learning, and statistics. While there are numerous methods for estimating $f$-divergences from data, a limit distribution theory, which quantifies the fluctuations of the estimation error, remains largely obscure. As limit theorems are pivotal for valid statistical inference, we close this gap by developing a general methodology for deriving distributional limits of $f$-divergence estimates, based on the functional delta method and Hadamard directional differentiability. Focusing on four prominent $f$-divergences -- the Kullback-Leibler divergence, $\chi^2$ divergence, squared Hellinger distance, and total variation distance -- we identify sufficient conditions on the population distributions for distributional limits to exist and characterize the limiting variables. These results are used to derive one- and two-sample limit theorems for Gaussian-smoothed $f$-divergences, both under the null and under the alternative. Finally, an application of the limit distribution theory to auditing differential privacy is proposed and analyzed in terms of significance level and power against local alternatives.
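For reference, a minimal sketch of the definitions the abstract invokes: the general $f$-divergence, the generators $f$ yielding the four divergences studied, and the Gaussian-smoothed variant. The notation (in particular $\mathcal{N}_\sigma$ and the superscript $(\sigma)$) is an illustrative assumption, not taken from the paper body.
\[
  \mathrm{D}_f(P\|Q) := \int f\!\left(\frac{\mathrm{d}P}{\mathrm{d}Q}\right)\mathrm{d}Q,
  \qquad f \text{ convex},\ f(1)=0,\ P \ll Q.
\]
\[
  \mathrm{KL}:\ f(t)=t\log t,\qquad
  \chi^2:\ f(t)=(t-1)^2,\qquad
  \mathrm{H}^2:\ f(t)=\big(\sqrt{t}-1\big)^2,\qquad
  \mathrm{TV}:\ f(t)=\tfrac{1}{2}|t-1|.
\]
The Gaussian-smoothed divergence compares the two distributions after convolving each with an isotropic Gaussian:
\[
  \mathrm{D}_f^{(\sigma)}(P\|Q) := \mathrm{D}_f\big(P * \mathcal{N}_\sigma \,\big\|\, Q * \mathcal{N}_\sigma\big),
  \qquad \mathcal{N}_\sigma := \mathcal{N}\big(0,\sigma^2 \mathrm{I}_d\big).
\]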