Inferring the input parameters of simulators from observations is a crucial challenge with applications from epidemiology to molecular dynamics. Here we present a simple approach for the regime of sparse data and approximately correct models, which is common when using an existing model to infer latent variables from observed data. The approach is based on the principle of maximum entropy (MaxEnt) and provably makes the smallest change in the latent joint distribution needed to fit new data. The method requires no likelihood or model derivatives, and its fit is insensitive to prior strength, removing the need to balance fit to observed data against prior belief. The method requires the ansatz that data is fit in expectation, which is true in some settings and may be reasonable in all settings when there are few data points. Because the method is based on sample reweighting, its asymptotic run time is independent of the dimension of the prior distribution. We demonstrate this MaxEnt approach and compare it with other likelihood-free inference methods across three systems: a point particle moving in a gravitational field, a compartmental model of epidemic spread, and a molecular dynamics simulation of a protein.
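
The reweighting idea can be illustrated with a minimal sketch, which is not necessarily the authors' implementation. It assumes a single observed value that must be matched in expectation, in which case the maximum entropy solution assigns each prior sample a weight proportional to exp(-λ g_i), where g_i is the simulated observable for sample i and λ is chosen so that the weighted mean equals the observation. The toy simulator (`theta ** 2`), the Gaussian prior, the target value, and the function name `maxent_weights` are all hypothetical choices made for illustration.

```python
import math
import random


def maxent_weights(g, target, tol=1e-10, max_iter=200):
    """Return weights w_i proportional to exp(-lam * g_i) whose weighted
    mean of g equals `target`. Illustrative sketch, one constraint only."""
    if not (min(g) < target < max(g)):
        raise ValueError("target must lie strictly inside the range of g")

    def weights(lam):
        # Subtract the largest exponent before exponentiating for stability.
        expo = [-lam * gi for gi in g]
        m = max(expo)
        u = [math.exp(e - m) for e in expo]
        z = sum(u)
        return [ui / z for ui in u]

    def mean(lam):
        return sum(wi * gi for wi, gi in zip(weights(lam), g))

    # The weighted mean decreases monotonically in lam (its derivative is
    # minus the weighted variance), so a bracket plus bisection suffices.
    lo, hi = -1.0, 1.0
    while mean(lo) <= target:
        lo *= 2.0
    while mean(hi) >= target:
        hi *= 2.0
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if mean(mid) > target:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return weights(0.5 * (lo + hi))


# Hypothetical toy problem: latent parameter theta with a Gaussian prior,
# and a stand-in "simulator" whose observable is theta squared.
random.seed(0)
theta = [random.gauss(1.0, 0.5) for _ in range(5000)]
g = [t ** 2 for t in theta]
target = 1.5  # assumed observation, to be matched in expectation

w = maxent_weights(g, target)
reweighted_theta_mean = sum(wi * ti for wi, ti in zip(w, theta))
effective_sample_size = 1.0 / sum(wi * wi for wi in w)
print(reweighted_theta_mean, effective_sample_size)
```

No likelihood or simulator derivative appears anywhere in the sketch: the simulator is run once per prior sample, and only the scalar λ is fit. The effective sample size printed at the end is a common diagnostic for how far the weights have moved from uniform.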