The Maximum Mean Discrepancy (MMD) is a widely used distance between multivariate distributions for two-sample testing. The standard MMD test statistic has an intractable null distribution, so calibration typically requires costly resampling or permutation approaches. In this work, we leverage a martingale interpretation of the estimated squared MMD to propose the martingale MMD (mMMD), a quadratic-time statistic that has a limiting standard Gaussian distribution under the null. Moreover, we show that the test is consistent against any fixed alternative. For large sample sizes, mMMD offers substantial computational savings over the standard MMD test, with only a minor loss in power.
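To make the calibration cost concrete, the following is a minimal sketch of the standard permutation-calibrated MMD test that the abstract contrasts against. It is not the mMMD statistic. The Gaussian kernel, the fixed `bandwidth`, the number of permutations, and all function names are assumptions chosen for illustration and do not come from the paper.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd2_unbiased(K, n):
    """Unbiased squared-MMD estimate from a pooled kernel matrix K.

    The first n rows/columns of K belong to the first sample,
    the remaining ones to the second sample.
    """
    m = K.shape[0] - n
    Kxx, Kyy, Kxy = K[:n, :n], K[n:, n:], K[:n, n:]
    # Within-sample terms exclude the diagonal (i == j).
    term_xx = (Kxx.sum() - np.trace(Kxx)) / (n * (n - 1))
    term_yy = (Kyy.sum() - np.trace(Kyy)) / (m * (m - 1))
    return term_xx + term_yy - 2.0 * Kxy.mean()


def mmd_permutation_test(x, y, num_permutations=200, bandwidth=1.0, seed=0):
    """Return (observed statistic, permutation p-value)."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    n = x.shape[0]
    # The kernel matrix is computed once and re-indexed per permutation.
    K = gaussian_kernel(pooled, pooled, bandwidth)
    observed = mmd2_unbiased(K, n)
    exceed = 0
    for _ in range(num_permutations):
        perm = rng.permutation(pooled.shape[0])
        if mmd2_unbiased(K[np.ix_(perm, perm)], n) >= observed:
            exceed += 1
    # The +1 terms count the observed labelling as one of the permutations.
    p_value = (1 + exceed) / (1 + num_permutations)
    return observed, p_value
```

Each permutation re-evaluates a quadratic-time statistic, so the total cost in this sketch grows with the number of permutations. A statistic with a standard Gaussian null limit, as claimed for mMMD, could instead be compared against a fixed normal quantile after a single evaluation.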