Ensemble Inference Methods for Models With Noisy and Expensive Likelihoods

The increasing availability of data presents an opportunity to calibrate unknown parameters which appear in complex models of phenomena in the biomedical, physical and social sciences. However, model complexity often leads to parameter-to-data maps which are expensive to evaluate and are only available through noisy approximations. This paper is concerned with the use of interacting particle systems for the solution of the resulting inverse problems for parameters. Of particular interest is the case where the available forward model evaluations are subject to rapid fluctuations, in parameter space, superimposed on the smoothly varying large scale parametric structure of interest. Multiscale analysis is used to study the behaviour of interacting particle system algorithms when such rapid fluctuations, which we refer to as noise, pollute the large scale parametric dependence of the parameter-to-data map. Ensemble Kalman methods (which are derivative-free) and Langevin-based methods (which use the derivative of the parameter-to-data map) are compared in this light. The ensemble Kalman methods are shown to behave favourably in the presence of noise in the parameter-to-data map, whereas Langevin methods are adversely affected. On the other hand, Langevin methods have the correct equilibrium distribution in the setting of noise-free forward models, whilst ensemble Kalman methods only provide an uncontrolled approximation, except in the linear case. Therefore a new class of algorithms, ensemble Gaussian process samplers, which combine the benefits of both ensemble Kalman and Langevin methods, are introduced and shown to perform favourably.
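To make the derivative-free idea concrete, the following is a minimal sketch of a basic perturbed-observation ensemble Kalman update applied to a toy scalar problem. It is not the paper's algorithm or its experimental setup. The forward map `noisy_forward`, the helper `eki_step`, the fluctuation scale `eps`, the noise variance `gamma`, and the ensemble size are all illustrative assumptions. The toy map adds a rapid fluctuation of amplitude `eps` to the smooth map u; its pointwise derivative, 1 + cos(u/eps), oscillates between 0 and 2 however small `eps` is, whereas the update below uses only ensemble covariances of forward evaluations.

```python
import numpy as np


def noisy_forward(u, eps=1e-3):
    """Hypothetical toy map: smooth part u plus a rapid fluctuation of size eps."""
    return u + eps * np.sin(u / eps)


def eki_step(ensemble, y, gamma, forward, rng):
    """One perturbed-observation ensemble Kalman update (scalar parameter and data).

    Only forward evaluations are used; no derivative of `forward` is required.
    """
    g = forward(ensemble)
    du = ensemble - ensemble.mean()
    dg = g - g.mean()
    c_ug = np.mean(du * dg)  # empirical cross-covariance of parameter and output
    c_gg = np.mean(dg * dg)  # empirical variance of the output
    # Each particle sees its own noisy copy of the data.
    perturbed_y = y + np.sqrt(gamma) * rng.standard_normal(ensemble.shape)
    gain = c_ug / (c_gg + gamma)
    return ensemble + gain * (perturbed_y - g)


rng = np.random.default_rng(0)
u_true = 1.5   # assumed "true" parameter for the toy problem
gamma = 0.01   # assumed observational noise variance
y = u_true     # data generated from the smooth part of the map

ensemble = rng.normal(0.0, 2.0, size=200)  # initial ensemble drawn from a wide Gaussian
for _ in range(20):
    ensemble = eki_step(ensemble, y, gamma, noisy_forward, rng)

ensemble_mean = ensemble.mean()
print("ensemble mean:", ensemble_mean)
```

In this toy setting the ensemble spread stays larger than the fluctuation period, so the empirical covariances effectively average over the rapid oscillations. A gradient-based sampler applied to the same map would instead evaluate the oscillating derivative directly.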
