Goodness-of-fit (GoF) tests are fundamental for assessing model adequacy. Score-based tests are appealing because they require fitting the model only once, under the null. However, extending them to powerful nonparametric alternatives is difficult because suitable score functions are lacking. Using a class of exponentially tilted models, we show that the resulting score-based GoF tests are equivalent to tests based on integral probability metrics (IPMs) indexed by a function class. When the class is rich, the test is universally consistent. This simple yet insightful perspective allows classical distance-based testing procedures, including those based on the Kolmogorov-Smirnov distance, the Wasserstein-1 distance, and the maximum mean discrepancy, to be reinterpreted as arising from score-based constructions. Building on this insight, we propose a new nonparametric score-based GoF test based on a special class of IPMs induced by a kernelized Stein function class, which we call the semiparametric kernelized Stein discrepancy (SKSD) test. Compared with other nonparametric score-based tests, the SKSD test is computationally efficient and accommodates general nuisance-parameter estimators, supported by a generic parametric bootstrap procedure. The SKSD test is universally consistent and attains Pitman efficiency. Moreover, with the help of Stein's identity, the SKSD test provides simple GoF tests for models with intractable likelihoods but tractable scores; we illustrate the power of our method on two popular models, kernel exponential families and conditional Gaussian models. Our method achieves power comparable to task-specific normality tests such as Anderson-Darling and Lilliefors, despite being designed for general nonparametric alternatives.
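To make the Stein-based construction concrete, the following is a minimal sketch of a plain kernelized Stein discrepancy (KSD) V-statistic for a simple null, which needs only the score of the null model and not its normalizing constant. It is not the SKSD test proposed here: it has no nuisance parameters, and the Gaussian kernel, the fixed bandwidth, the standard normal null, and all function names (`ksd_vstat`, `mc_pvalue`, `normal_score`) are assumptions made purely for illustration. Calibration uses simple Monte Carlo simulation from the fully specified null, standing in for the parametric bootstrap.

```python
import numpy as np

def ksd_vstat(x, score, bandwidth=1.0):
    """V-statistic estimate of the squared KSD with a Gaussian kernel.

    x: (n,) or (n, d) sample; score: maps (n, d) -> (n, d), the gradient
    of the log-density of the null model (hypothetical interface).
    """
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    n, d = x.shape
    s = score(x)                               # scores at each point, (n, d)
    diff = x[:, None, :] - x[None, :, :]       # diff[i, j] = x_i - x_j
    sq = np.sum(diff ** 2, axis=-1)            # squared distances, (n, n)
    h2 = bandwidth ** 2
    k = np.exp(-sq / (2.0 * h2))               # Gaussian kernel matrix
    # Four terms of the Stein kernel u(x_i, x_j):
    t1 = (s @ s.T) * k                                   # s_i . s_j k
    t2 = np.einsum("id,ijd->ij", s, diff) / h2 * k       # s_i . grad_y k
    t3 = -np.einsum("jd,ijd->ij", s, diff) / h2 * k      # s_j . grad_x k
    t4 = (d / h2 - sq / h2 ** 2) * k                     # trace(grad_x grad_y k)
    return float(np.mean(t1 + t2 + t3 + t4))

def normal_score(x):
    """Score of the standard normal null: d/dx log p(x) = -x."""
    return -x

def mc_pvalue(x, rng, n_sim=199):
    """Monte Carlo p-value by simulating from the N(0, 1) null (1-D data)."""
    obs = ksd_vstat(x, normal_score)
    sims = np.array([
        ksd_vstat(rng.standard_normal(len(x)), normal_score)
        for _ in range(n_sim)
    ])
    return (1 + np.sum(sims >= obs)) / (n_sim + 1)

rng = np.random.default_rng(0)
x_null = rng.standard_normal(200)              # data drawn from the null
x_alt = rng.standard_normal(200) + 2.0         # mean-shifted alternative
ksd_null = ksd_vstat(x_null, normal_score)
ksd_alt = ksd_vstat(x_alt, normal_score)
p_alt = mc_pvalue(x_alt, rng)
print(ksd_null, ksd_alt, p_alt)
```

The statistic is close to zero when the sample comes from the null and grows under the shifted alternative. With estimated nuisance parameters, the simulation step would need to refit the model on each simulated sample, which this sketch omits.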