In this paper we propose two statistical hypothesis tests for the regression function of binary classification, based on conditional kernel mean embeddings. The regression function is a fundamental object in classification, as it determines both the Bayes optimal classifier and the misclassification probabilities. A resampling-based framework is combined with consistent point estimators of the conditional kernel mean map to construct distribution-free hypothesis tests. The tests are formulated in a flexible manner that allows us to control the exact probability of type I error. We also prove that both proposed techniques are consistent under weak statistical assumptions, i.e., their type II error probabilities converge pointwise to zero.
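To illustrate how a resampling-based test can achieve an exact, distribution-free type I error probability, the following is a minimal sketch of a generic Monte Carlo rank test for a candidate regression function. It is not the construction of the paper: the Nadaraya-Watson smoother stands in for a conditional kernel mean embedding estimator, labels are assumed to take values in {-1, +1}, and the names `nw_estimate`, `resampling_rank_test`, `f0`, `m`, `q` and `bandwidth` are hypothetical choices made only for this example.

```python
import numpy as np


def nw_estimate(x, y, bandwidth=0.2):
    """Nadaraya-Watson estimate of E[Y | X = x_i] at the sample points.

    Assumption: a Gaussian kernel smoother is used here only as a simple
    stand-in for a consistent point estimator of the regression function.
    """
    d2 = (x[:, None] - x[None, :]) ** 2
    w = np.exp(-d2 / (2.0 * bandwidth ** 2))
    return (w @ y) / w.sum(axis=1)


def resampling_rank_test(x, y, f0, m=40, q=38, rng=None):
    """Test H0: E[Y | X = x] = f0(x) for labels y in {-1, +1}.

    m - 1 alternative label vectors are drawn from the null model, the same
    statistic is computed on every sample, and H0 is rejected if the rank of
    the original statistic exceeds q. Under H0 the m statistics are
    exchangeable, so with random tie-breaking the rank is uniform on
    {1, ..., m} and the type I error probability is exactly 1 - q / m.
    """
    rng = np.random.default_rng(rng)
    f0x = f0(x)
    p_plus = (1.0 + f0x) / 2.0  # P(Y = +1 | X = x) implied by H0

    def stat(labels):
        # Squared distance between the fitted estimate and the candidate f0.
        return np.mean((nw_estimate(x, labels) - f0x) ** 2)

    stats = [stat(y)]  # index 0 holds the statistic of the original sample
    for _ in range(m - 1):
        y_alt = np.where(rng.random(len(x)) < p_plus, 1.0, -1.0)
        stats.append(stat(y_alt))
    stats = np.asarray(stats)

    # Rank the original statistic; ties are broken by independent noise.
    tie_breaker = rng.random(m)
    order = np.lexsort((tie_breaker, stats))  # primary key: stats
    rank = int(np.where(order == 0)[0][0]) + 1
    return rank > q, rank


if __name__ == "__main__":
    gen = np.random.default_rng(0)
    x = gen.uniform(-1.0, 1.0, size=200)
    f_true = lambda t: 0.9 * np.tanh(5.0 * t)
    y = np.where(gen.random(200) < (1.0 + f_true(x)) / 2.0, 1.0, -1.0)
    print(resampling_rank_test(x, y, f_true, rng=1))
```

With `m = 40` and `q = 38` the sketch has a type I error probability of exactly 0.05 for any data distribution satisfying the null hypothesis; other levels of the form `1 - q / m` are obtained by changing the two integers.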