While diffusion models have shown great success in image generation, their noise-inverting generative process does not explicitly consider the structure of images, such as their inherent multi-scale nature. Inspired by diffusion models and the desirability of coarse-to-fine modelling, we propose a new model that generates images through iteratively inverting the heat equation, a PDE that locally erases fine-scale information when run over the 2D plane of the image. In our novel methodology, the solution of the forward heat equation is interpreted as a variational approximation in a directed graphical model. We demonstrate promising image quality and point out emergent qualitative properties not seen in diffusion models, such as disentanglement of overall colour and shape in images and aspects of neural network interpretability. Spectral analysis on natural images positions our model as a type of dual to diffusion models and reveals implicit inductive biases in them.
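As a rough illustration of the forward coarsening process described above (a minimal sketch, not the authors' implementation), the heat equation on an image with reflecting boundaries can be solved exactly in the DCT basis, where each spatial frequency decays exponentially with time; the function name `heat_dissipate`, the grid normalisation, and the example times are illustrative assumptions:

```python
import numpy as np
from scipy.fft import dctn, idctn

def heat_dissipate(img: np.ndarray, t: float) -> np.ndarray:
    """Evolve a 2D image under the heat equation u_t = Laplacian(u) for time t.

    With Neumann (reflecting) boundaries the cosine basis diagonalises the
    Laplacian, so the forward process is exact in DCT space: each frequency
    component decays as exp(-lambda * t). Larger t erases finer scales,
    approaching the mean colour of the image in the limit.
    """
    h, w = img.shape
    # Squared spatial frequencies (Laplacian eigenvalues) on an h x w grid
    lam = (np.pi * np.arange(h)[:, None] / h) ** 2 \
        + (np.pi * np.arange(w)[None, :] / w) ** 2
    coeffs = dctn(img, norm="ortho")
    return idctn(coeffs * np.exp(-lam * t), norm="ortho")

# Example: progressively coarser versions of a random "image"
img = np.random.rand(64, 64)
coarse_to_fine = [heat_dissipate(img, t) for t in (0.0, 1.0, 10.0, 100.0)]
```

The generative model then learns to invert this dissipation step by step, recovering fine-scale detail from coarse structure, in contrast to diffusion models, which invert the addition of noise.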