Recently, diffusion models have demonstrated impressive image generation performance and have been extensively studied in various computer vision tasks. Unfortunately, training and evaluating diffusion models consumes a lot of time and computational resources. To address this problem, we present a novel pyramidal diffusion model that can generate high-resolution images starting from much coarser-resolution images using a {\em single} score function trained with a positional embedding. This allows the neural network to be much lighter and enables time-efficient image generation without compromising performance. Furthermore, we show that the proposed approach can also be used efficiently for multi-scale super-resolution problems using a single score function.
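As a rough illustration of the coarse-to-fine idea, the following minimal sketch shows how one score function could be reused at several resolutions by conditioning it on a positional embedding of normalized pixel coordinates. Everything here is an assumption made for illustration: the names `coord_grid`, `positional_embedding`, `toy_score`, and `pyramid_sample` are hypothetical, the sinusoidal embedding is one common choice rather than necessarily the one used in this work, the sampler is plain Langevin dynamics swapped in for simplicity, and `toy_score` is a stand-in for a trained network.

```python
import numpy as np

def coord_grid(res):
    """Pixel-center coordinates in (0, 1) for a res x res image, shape (res, res, 2)."""
    centers = (np.arange(res) + 0.5) / res
    ys, xs = np.meshgrid(centers, centers, indexing="ij")
    return np.stack([ys, xs], axis=-1)

def positional_embedding(coords, num_freqs=4):
    """Sinusoidal embedding of 2-D coordinates, shape (..., 4 * num_freqs)."""
    freqs = (2.0 ** np.arange(num_freqs)) * np.pi        # (F,)
    angles = coords[..., None] * freqs                   # (..., 2, F)
    emb = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return emb.reshape(*coords.shape[:-1], -1)

def upsample_nearest(x, factor):
    """Nearest-neighbor upsampling of a 2-D array by an integer factor."""
    return x.repeat(factor, axis=0).repeat(factor, axis=1)

def toy_score(x, pos_emb):
    """Placeholder score (standard Gaussian); a trained network would use pos_emb."""
    return -x

def pyramid_sample(score_fn, resolutions=(8, 16, 32), steps=5, step_size=0.1, seed=0):
    """Coarse-to-fine sampling that calls the SAME score_fn at every resolution.

    Assumes each resolution is an integer multiple of the previous one.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((resolutions[0], resolutions[0]))
    for i, res in enumerate(resolutions):
        if i > 0:
            x = upsample_nearest(x, res // x.shape[0])   # move to the finer grid
        emb = positional_embedding(coord_grid(res))      # tells the model the scale
        for _ in range(steps):                           # simple Langevin updates
            noise = rng.standard_normal(x.shape)
            x = x + step_size * score_fn(x, emb) + np.sqrt(2 * step_size) * noise
    return x

sample = pyramid_sample(toy_score)
```

Because the coordinates are normalized, a coarse grid and a fine grid cover the same spatial extent with different spacing, which is what would let a single network distinguish the scale it is operating on.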