Given a probability distribution $\mu$ on $\mathbb{R}^d$ represented by data, we study the generative modeling of its conditional probability distributions on the level sets of a collective variable $\xi: \mathbb{R}^d \rightarrow \mathbb{R}^k$, where $1 \le k < d$. We propose a general and efficient approach that learns generative models on different level sets of $\xi$ simultaneously. To improve learning quality on level sets in low-probability regions, we also propose a data-enrichment strategy that uses data from enhanced sampling techniques. We demonstrate the effectiveness of the approach on concrete numerical examples. The approach is potentially useful, for instance, for the generative modeling of molecular systems in biophysics.