Score-based models generate samples by mapping noise to data (and vice versa) via a high-dimensional diffusion process. We question whether it is necessary to run this entire process at high dimensionality and incur all the inconveniences thereof. Instead, we restrict the diffusion via projections onto subspaces as the data distribution evolves toward noise. When applied to state-of-the-art models, our framework simultaneously improves sample quality -- reaching an FID of 2.17 on unconditional CIFAR-10 -- and reduces the computational cost of inference for the same number of denoising steps. Our framework is fully compatible with continuous-time diffusion and retains its flexible capabilities, including exact log-likelihoods and controllable generation. Code is available at https://github.com/bjing2016/subspace-diffusion.
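As a rough illustration of what projecting onto a subspace means here, the sketch below maps a high-dimensional vector to coordinates in a lower-dimensional subspace and lifts it back. It is a minimal sketch under stated assumptions, not the implementation in the linked repository: the random orthonormal basis and the helper names `orthonormal_basis`, `project`, and `lift` are hypothetical and chosen only for illustration.

```python
import numpy as np


def orthonormal_basis(dim, sub_dim, seed=0):
    """Return a (dim, sub_dim) matrix with orthonormal columns.

    Assumption: a random subspace stands in for whatever subspace
    a real model would use.
    """
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((dim, sub_dim)))
    return q


def project(x, basis):
    """Coordinates of x in the subspace spanned by the basis columns."""
    return basis.T @ x


def lift(coords, basis):
    """Embed subspace coordinates back into the full space."""
    return basis @ coords


if __name__ == "__main__":
    dim, sub_dim = 3072, 768  # e.g. a flattened 32x32x3 image and a smaller subspace
    basis = orthonormal_basis(dim, sub_dim)

    x = np.random.default_rng(1).standard_normal(dim)
    coords = project(x, basis)      # lower-dimensional state
    x_back = lift(coords, basis)    # component of x lying in the subspace

    print(coords.shape, x_back.shape)
    # Fraction of the squared norm of x that lies outside the subspace.
    print(np.sum((x - x_back) ** 2) / np.sum(x ** 2))
```

Once the state is represented by `coords`, every subsequent step operates on `sub_dim` values rather than `dim`, which is the intuition behind the reduced inference cost mentioned above.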