Generating a Boltzmann distribution in high dimension has recently been achieved with Normalizing Flows, which enable fast and exact computation of the generated density, and thus unbiased estimation of expectations. However, current implementations rely on accurate training data, which typically comes from computationally expensive simulations. There is therefore a clear incentive to train models with incomplete or no data by relying solely on the target density, which can be obtained from a physical energy model (up to a constant factor). For that purpose, we analyze the properties of standard losses based on Kullback-Leibler divergences. We showcase their limitations, in particular a strong propensity for mode collapse during optimization on high-dimensional distributions. We then propose strategies to alleviate these issues, most importantly a new loss function well-grounded in theory and with suitable optimization properties. Using as a benchmark the generation of 3D molecular configurations, we show on several tasks that, for the first time, imperfect pre-trained models can be further optimized in the absence of training data.
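For orientation, the two Kullback-Leibler losses the abstract contrasts can be written out as follows. This is a sketch in assumed notation, not taken from the paper: $p(x) = e^{-u(x)}/Z$ is the Boltzmann target with reduced energy $u$ and unknown normalization $Z$, $q_\theta$ is the flow density obtained by pushing a base density $q_Z$ through an invertible map $f_\theta$, and $O$ is an arbitrary observable.

```latex
% Assumed notation (not from the source): u, Z, q_Z, f_theta, O.
\begin{align}
  % Change of variables: exact density of a generated sample x = f_\theta(z)
  \log q_\theta\bigl(f_\theta(z)\bigr)
    &= \log q_Z(z) - \log\bigl|\det J_{f_\theta}(z)\bigr| \\
  % Forward KL (maximum likelihood): requires samples x ~ p, i.e. training data
  \mathrm{KL}(p \,\|\, q_\theta)
    &= -\,\mathbb{E}_{x \sim p}\bigl[\log q_\theta(x)\bigr] + \text{const} \\
  % Reverse KL: requires only the energy u, evaluated on generated samples
  \mathrm{KL}(q_\theta \,\|\, p)
    &= \mathbb{E}_{x \sim q_\theta}\bigl[\log q_\theta(x) + u(x)\bigr] + \log Z \\
  % Reweighting generated samples toward the target (self-normalized form)
  \mathbb{E}_{x \sim p}\bigl[O(x)\bigr]
    &= \frac{\mathbb{E}_{x \sim q_\theta}\bigl[w(x)\,O(x)\bigr]}
            {\mathbb{E}_{x \sim q_\theta}\bigl[w(x)\bigr]},
    \qquad w(x) = \frac{e^{-u(x)}}{q_\theta(x)}
\end{align}
```

Since $\log Z$ does not depend on $\theta$, the reverse KL can be minimized knowing the target only up to a constant factor, which is what makes training without data possible. Its expectation is taken over the model's own samples, so regions of the target that $q_\theta$ never visits contribute nothing to the loss; this is the usual explanation for the mode-seeking behavior of this objective. The last identity holds when $q_\theta$ is nonzero wherever $p$ is; replacing both expectations with sample averages gives a consistent estimator.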