We investigate properties of goodness-of-fit tests based on the Kernel Stein Discrepancy (KSD). We introduce a strategy to construct a test, called KSDAgg, which aggregates multiple tests with different kernels. KSDAgg avoids splitting the data to perform kernel selection (which leads to a loss in test power), and rather maximises the test power over a collection of kernels. We provide non-asymptotic guarantees on the power of KSDAgg: we show it achieves the smallest uniform separation rate of the collection, up to a logarithmic term. For compactly supported densities with bounded model score function, we derive the rate for KSDAgg over restricted Sobolev balls; this rate corresponds to the minimax optimal rate over unrestricted Sobolev balls, up to an iterated logarithmic term. KSDAgg can be computed exactly in practice as it relies either on a parametric bootstrap or on a wild bootstrap to estimate the quantiles and the level corrections. In particular, for the crucial choice of bandwidth of a fixed kernel, it avoids resorting to arbitrary heuristics (such as median or standard deviation) or to data splitting. We find on both synthetic and real-world data that KSDAgg outperforms other state-of-the-art quadratic-time adaptive KSD-based goodness-of-fit testing procedures.
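To make the quadratic-time KSD test and the wild-bootstrap quantile estimation mentioned above concrete, here is a minimal sketch of a *single* fixed-kernel KSD goodness-of-fit test. This is an illustrative assumption, not the paper's KSDAgg procedure (which aggregates many such tests over a kernel collection with level corrections): the Gaussian kernel, the bandwidth `h`, and all function names are hypothetical choices made for the example.

```python
import numpy as np

def gaussian_stein_kernel(X, score, h):
    """Stein kernel matrix u_p(x_i, x_j) for a Gaussian kernel of bandwidth h.

    `score` returns the model score function grad log p evaluated row-wise.
    """
    n, d = X.shape
    S = score(X)                                   # model score at samples, (n, d)
    diff = X[:, None, :] - X[None, :, :]           # x_i - x_j, shape (n, n, d)
    sq = np.sum(diff ** 2, axis=2)
    K = np.exp(-sq / (2 * h ** 2))                 # Gaussian kernel matrix
    t1 = (S @ S.T) * K                             # s(x)^T s(y) k(x, y)
    t2 = np.einsum('id,ijd->ij', S, diff) * K / h ** 2    # s(x)^T grad_y k(x, y)
    t3 = -np.einsum('jd,ijd->ij', S, diff) * K / h ** 2   # s(y)^T grad_x k(x, y)
    t4 = (d / h ** 2 - sq / h ** 4) * K            # trace(grad_x grad_y k(x, y))
    return t1 + t2 + t3 + t4

def ksd_wild_bootstrap_test(X, score, h, alpha=0.05, B=500, rng=None):
    """Quadratic-time KSD test whose quantile is estimated by a wild bootstrap
    with Rademacher sign flips. Returns (statistic, p-value, reject?)."""
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    U = gaussian_stein_kernel(X, score, h)
    np.fill_diagonal(U, 0.0)                       # U-statistic: drop the diagonal
    stat = U.sum() / (n * (n - 1))
    eps = rng.choice([-1.0, 1.0], size=(B, n))     # wild-bootstrap signs
    boot = np.einsum('bi,ij,bj->b', eps, U, eps) / (n * (n - 1))
    p_value = (1 + np.sum(boot >= stat)) / (B + 1)
    return stat, p_value, p_value <= alpha
```

For a standard normal model the score is simply `score = lambda x: -x`, so samples drawn from N(0, I) should typically be accepted while clearly shifted samples are rejected. KSDAgg would run such a test for every bandwidth in a collection and combine the outcomes with corrected levels, rather than committing to one heuristic `h`.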