We characterize the asymptotic performance of nonparametric one- and two-sample testing. The exponential decay rate, or error exponent, of the type-II error probability is used as the asymptotic performance metric, and an optimal test achieves the maximum rate subject to a constant level constraint on the type-I error probability. Using Sanov's theorem, we derive a sufficient condition for one-sample tests to achieve the optimal error exponent in the universal setting, i.e., for any distribution defining the alternative hypothesis. We then show that two classes of Maximum Mean Discrepancy (MMD) based tests attain the optimal type-II error exponent on $\mathbb R^d$, while the quadratic-time Kernel Stein Discrepancy (KSD) based tests achieve this optimality under an asymptotic level constraint. For general two-sample testing, however, Sanov's theorem is insufficient to obtain a similar sufficient condition. We therefore establish an extended version of Sanov's theorem and derive an exact error exponent for the quadratic-time MMD based two-sample tests. The obtained error exponent is further shown to be optimal among all two-sample tests satisfying a given level constraint. Our work hence provides an achievability result for optimal nonparametric one- and two-sample testing in the universal setting. Applications to off-line change detection and related issues are also discussed.
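For readers unfamiliar with the quadratic-time MMD statistic mentioned above, the following is a minimal sketch of the standard unbiased estimator of the squared MMD between two samples. The Gaussian kernel, the `bandwidth` parameter, and the function names are assumptions made for illustration; the abstract does not specify a kernel or a test threshold, and this sketch computes only the statistic, not the level-constrained test analyzed in the paper.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between rows of a (n x d) and b (m x d).

    The kernel choice and bandwidth are illustrative assumptions.
    """
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd2_unbiased(x, y, bandwidth=1.0):
    """Unbiased quadratic-time estimate of the squared MMD between x and y."""
    n, m = len(x), len(y)
    kxx = gaussian_kernel(x, x, bandwidth)
    kyy = gaussian_kernel(y, y, bandwidth)
    kxy = gaussian_kernel(x, y, bandwidth)
    # Within-sample averages exclude the diagonal (i == j) terms.
    term_x = (kxx.sum() - np.trace(kxx)) / (n * (n - 1))
    term_y = (kyy.sum() - np.trace(kyy)) / (m * (m - 1))
    # Cross-sample average uses all n * m pairs.
    return term_x + term_y - 2.0 * kxy.mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(200, 2))
    y_same = rng.normal(size=(200, 2))            # same distribution as x
    y_shift = rng.normal(loc=1.0, size=(200, 2))  # mean-shifted alternative
    print("same distribution:", mmd2_unbiased(x, y_same))
    print("shifted mean:     ", mmd2_unbiased(x, y_shift))
```

A two-sample test would reject the null hypothesis when this statistic exceeds a threshold chosen to satisfy the level constraint; how that threshold is set is left out of this sketch.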