We consider the problem of generative modeling based on smoothing an unknown density of interest in $\mathbb{R}^d$ with factorial kernels, introduced by Saremi and Srivastava (2022), that have $M$ independent Gaussian channels with equal noise levels. First, we fully characterize the time complexity of learning the resulting smoothed density in $\mathbb{R}^{Md}$, called the M-density, by deriving a universal form for its parametrization in which the score function is permutation equivariant by construction. Next, we study the time complexity of sampling an M-density by analyzing its condition number for Gaussian distributions. This spectral analysis gives geometric insight into the "shape" of M-densities as $M$ increases. Finally, we present results on sample quality for this class of generative models on the CIFAR-10 dataset, where we report a Fr\'echet inception distance of 14.15, notably obtained with a single noise level on long-run fast-mixing MCMC chains.
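To make the Gaussian condition-number analysis concrete, the following is a minimal sketch and not the paper's own code. It assumes $X \sim \mathcal{N}(0, \Sigma)$ in $\mathbb{R}^d$ and $M$ noisy channels $Y_m = X + N_m$ with independent $N_m \sim \mathcal{N}(0, \sigma^2 I_d)$, so that the joint covariance of $(Y_1, \dots, Y_M)$ in $\mathbb{R}^{Md}$ is $\mathbf{1}\mathbf{1}^\top \otimes \Sigma + \sigma^2 I_{Md}$. The function names `m_density_covariance` and `condition_number`, and the example values of $\Sigma$ and $\sigma$, are illustrative assumptions.

```python
import numpy as np


def m_density_covariance(cov, sigma, M):
    """Covariance of (Y_1, ..., Y_M) for Y_m = X + N_m (assumed setup).

    Every d x d block equals cov (the shared X), and the diagonal gains
    sigma^2 from the independent Gaussian noise in each channel.
    """
    d = cov.shape[0]
    return np.kron(np.ones((M, M)), cov) + sigma**2 * np.eye(M * d)


def condition_number(mat):
    """Ratio of largest to smallest eigenvalue of a symmetric PD matrix."""
    eig = np.linalg.eigvalsh(mat)  # eigenvalues in ascending order
    return eig[-1] / eig[0]


# Hypothetical example: an anisotropic 2-D Gaussian and unit noise level.
cov = np.diag([4.0, 1.0])
sigma = 1.0

for M in (1, 2, 4, 8):
    kappa = condition_number(m_density_covariance(cov, sigma, M))
    print(f"M={M}: condition number = {kappa:.3f}")
```

Under these assumptions, the eigenvalues of the joint covariance are $M\lambda_i + \sigma^2$ for each eigenvalue $\lambda_i$ of $\Sigma$, together with $\sigma^2$ repeated $(M-1)d$ times, so for $M \ge 2$ the condition number is $1 + M\lambda_{\max}/\sigma^2$. How $\sigma$ is chosen as $M$ grows therefore determines how the "shape" of the Gaussian M-density changes.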