Many areas of science make extensive use of computer simulators that implicitly encode likelihood functions of complex systems. Classical statistical methods are poorly suited for these so-called likelihood-free inference (LFI) settings, particularly outside asymptotic and low-dimensional regimes. Although new machine learning methods, such as normalizing flows, have revolutionized the sample efficiency and capacity of LFI methods, it remains an open question whether they produce confidence sets with correct conditional coverage for small sample sizes. This paper unifies classical statistics with modern machine learning to present (i) a practical procedure for the Neyman construction of confidence sets with finite-sample guarantees of nominal coverage, and (ii) diagnostics that estimate conditional coverage over the entire parameter space. We refer to our framework as likelihood-free frequentist inference (LF2I). Any method that defines a test statistic, like the likelihood ratio, can leverage the LF2I machinery to create valid confidence sets and diagnostics without costly Monte Carlo samples at fixed parameter settings. We study the power of two test statistics (ACORE and BFF), which, respectively, maximize versus integrate an odds function over the parameter space. Our paper discusses the benefits and challenges of LF2I, with a breakdown of the sources of errors in LF2I confidence sets.