Normalizing flows are tractable density models that can approximate complicated target distributions, e.g. Boltzmann distributions of physical systems. However, current methods for training flows either suffer from mode-seeking behavior, use samples from the target generated beforehand by expensive MCMC simulations, or rely on stochastic losses with very high variance. To avoid these problems, we augment flows with annealed importance sampling (AIS) and minimize the mass-covering $\alpha$-divergence with $\alpha=2$, which minimizes importance-weight variance. Our method, Flow AIS Bootstrap (FAB), uses AIS to generate samples in regions where the flow is a poor approximation of the target, facilitating the discovery of new modes. We use AIS to target the minimum-variance distribution for estimating the $\alpha$-divergence via importance sampling, and we use a prioritized buffer to store and reuse AIS samples. These two features significantly improve FAB's performance. We apply FAB to complex multimodal targets and show that we can approximate them very accurately where previous methods fail. To the best of our knowledge, we are the first to learn the Boltzmann distribution of the alanine dipeptide molecule using only the unnormalized target density, without access to samples generated via Molecular Dynamics (MD) simulations: FAB produces better results than training via maximum likelihood on MD samples while using 100 times fewer target evaluations. After reweighting samples with importance weights, we obtain unbiased histograms of dihedral angles that are almost identical to the ground-truth ones.
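To make the training objective concrete, the following is a minimal numerical sketch of estimating the mass-covering $\alpha=2$ divergence term $\mathbb{E}_q[(p/q)^2]$ by importance sampling. The toy bimodal target, the fixed Gaussian "flow" $q$, and all variable names here are illustrative assumptions, not the paper's implementation; in FAB, the samples would instead come from AIS targeting the minimum-variance distribution $g \propto p^2/q$, and $q$ would be a trainable flow.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_target(x):
    # Unnormalized log-density of a toy bimodal target
    # (a stand-in for a Boltzmann distribution).
    return np.logaddexp(-0.5 * (x - 2.0) ** 2, -0.5 * (x + 2.0) ** 2)

# A Gaussian "flow" q with fixed mean/scale (purely illustrative;
# in practice q is a trainable normalizing flow).
mu, sigma = 0.0, 3.0

def log_q(x):
    return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma) - 0.5 * np.log(2 * np.pi)

# Draw samples from q. In FAB these would instead be AIS samples
# targeting g ∝ p²/q, the minimum-variance importance distribution.
x = rng.normal(mu, sigma, size=10_000)

# Log importance weights: log w = log p(x) - log q(x).
log_w = log_target(x) - log_q(x)

# Monte Carlo estimate of E_q[(p/q)^2], which is proportional (up to the
# unknown normalizer of p) to the α=2 divergence objective. Computed in
# log-space for numerical stability.
m = log_w.max()
log_alpha2_estimate = np.log(np.mean(np.exp(2 * (log_w - m)))) + 2 * m
```

Minimizing this quantity with respect to the parameters of $q$ drives the flow toward covering all modes of the target, since the $(p/q)^2$ weights penalize regions where $q$ places too little mass.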