Complex scientific models whose likelihood cannot be evaluated present a challenge for statistical inference. Over the past two decades, a wide range of algorithms has been proposed for learning parameters in computationally feasible ways, often under the heading of approximate Bayesian computation or likelihood-free inference. There is, however, no consensus on how to rigorously evaluate the performance of these algorithms. Here, we argue for scoring algorithms by the mean squared error in estimating expectations of functions with respect to the posterior. We show that this score subsumes common alternatives, including the acceptance rate and effective sample size, as limiting special cases. We then derive asymptotically optimal distributions for choosing or sampling discrete or continuous simulation parameters, respectively. Our recommendations differ significantly from guidelines based on alternative scores outside of their region of validity. As an application, we show that sequential Monte Carlo in this context can be made more accurate with no new samples by accepting particles from all rounds.
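To make the proposed score concrete, here is a minimal illustrative sketch (not the paper's experiments): a toy conjugate Gaussian model where the exact posterior mean is known, a rejection-ABC estimator of that posterior expectation, and the mean squared error of the estimator over repeated runs. The model, tolerance, and sample sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy conjugate model: theta ~ N(0, 1), y | theta ~ N(theta, 1).
# Given y_obs, the exact posterior is N(y_obs / 2, 1/2), so the
# true posterior mean is known and the MSE score can be computed.
y_obs = 1.0
true_post_mean = y_obs / 2.0

def abc_posterior_mean(n_sims, eps):
    """Rejection-ABC estimate of E[theta | y_obs]."""
    theta = rng.normal(0.0, 1.0, size=n_sims)   # draw parameters from the prior
    y = rng.normal(theta, 1.0)                  # simulate data from the model
    accepted = theta[np.abs(y - y_obs) < eps]   # keep simulations close to y_obs
    return accepted.mean() if accepted.size else np.nan

# Score the algorithm: mean squared error of the posterior-expectation
# estimate over independent repetitions of the whole procedure.
estimates = np.array([abc_posterior_mean(2000, 0.5) for _ in range(100)])
mse = np.nanmean((estimates - true_post_mean) ** 2)
print(f"MSE of posterior-mean estimate: {mse:.4f}")
```

Under this score, design choices such as the tolerance or the proposal distribution are compared by how much they reduce the MSE, rather than by the acceptance rate alone.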