The maximum mean discrepancy (MMD) is a kernel-based distance between probability distributions that is useful in many applications (Gretton et al. 2012) and admits a simple estimator with attractive computational and statistical properties. Efficiently estimating the variance of this estimator is helpful for various problems in two-sample testing. To this end, Bounliphone et al. (2016) used the theory of U-statistics to derive estimators for the variance of an MMD estimator and of the difference between two such estimators. Their estimator, however, drops lower-order terms and is therefore unnecessarily biased. In this note, extending and correcting the work of Sutherland et al. (2017), we show that a truly unbiased estimator can be found for the actual variance of both the squared MMD estimator and the difference of two correlated squared MMD estimators, at essentially no additional computational cost.
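For concreteness, the "simple estimator" mentioned above can be sketched as follows. This is a minimal illustration of the standard unbiased estimator of the squared MMD from Gretton et al. (2012), not the variance estimator derived in this note. The Gaussian kernel, the `bandwidth` parameter, and the function names `gaussian_kernel` and `mmd2_unbiased` are assumptions chosen for the example.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between rows of a (m x d) and b (n x d).

    The kernel choice and bandwidth are illustrative assumptions.
    """
    # Pairwise squared Euclidean distances via the expansion |a|^2 + |b|^2 - 2 a.b
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point rounding
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd2_unbiased(x, y, bandwidth=1.0):
    """Unbiased estimate of the squared MMD between samples x and y.

    Within-sample sums exclude the diagonal (i == j) terms, which is what
    makes the estimate unbiased; as a result it can be slightly negative.
    """
    m, n = len(x), len(y)
    k_xx = gaussian_kernel(x, x, bandwidth)
    k_yy = gaussian_kernel(y, y, bandwidth)
    k_xy = gaussian_kernel(x, y, bandwidth)

    # Sum over i != j by subtracting the diagonal from the full sum
    sum_xx = k_xx.sum() - np.trace(k_xx)
    sum_yy = k_yy.sum() - np.trace(k_yy)

    return (
        sum_xx / (m * (m - 1))
        + sum_yy / (n * (n - 1))
        - 2.0 * k_xy.mean()
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(0.0, 1.0, size=(200, 1))
    y = rng.normal(3.0, 1.0, size=(200, 1))
    print(mmd2_unbiased(x, y))
```

Each call builds three kernel matrices, so the cost is quadratic in the sample sizes. The variance estimators discussed in this note are described as reusing such quantities at essentially no additional cost, though their exact form is not reproduced here.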