Denoising diffusion probabilistic models (DDPMs) are powerful hierarchical latent variable models with remarkable sample generation quality and training stability. These properties can be attributed to parameter sharing in the generative hierarchy, as well as a parameter-free diffusion-based inference procedure. In this paper, we present Few-Shot Diffusion Models (FSDM), a framework for few-shot generation leveraging conditional DDPMs. FSDMs are trained to adapt the generative process conditioned on a small set of images from a given class by aggregating image patch information using a set-based Vision Transformer (ViT). At test time, the model is able to generate samples from previously unseen classes conditioned on as few as 5 samples from that class. We empirically show that FSDM can perform few-shot generation and transfer to new datasets. We benchmark variants of our method on complex vision datasets for few-shot learning and compare them to unconditional and conditional DDPM baselines. Additionally, we show how conditioning the model on patch-based information from the input set improves training convergence.
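The conditioning pipeline described above can be sketched as follows. This is a minimal illustrative simplification, not the paper's implementation: the random patch projection and mean-pooling aggregator stand in for the learned set-based ViT encoder, and the noise predictor is a placeholder; only the ancestral DDPM sampling step follows the standard formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def patch_embed(images, patch=4, dim=16):
    """Split each image into non-overlapping patches and project them to tokens.
    A fixed random projection stands in for the learned ViT patch embedding."""
    n, h, w = images.shape
    W = np.linspace(-0.1, 0.1, patch * patch * dim).reshape(patch * patch, dim)
    patches = (images.reshape(n, h // patch, patch, w // patch, patch)
                     .transpose(0, 1, 3, 2, 4)
                     .reshape(n, -1, patch * patch))
    return patches @ W  # (n, num_patches, dim)

def set_context(images):
    """Aggregate patch tokens over the whole conditioning set into one context
    vector. Mean pooling is a simple proxy for set-based ViT attention."""
    tokens = patch_embed(images)                              # (n, p, dim)
    return tokens.reshape(-1, tokens.shape[-1]).mean(axis=0)  # (dim,)

def ddpm_step(x_t, t, eps_hat, betas):
    """One ancestral DDPM sampling step x_t -> x_{t-1}, given a (context-
    conditioned) noise estimate eps_hat."""
    alphas = 1.0 - betas
    abar = np.cumprod(alphas)
    mean = (x_t - betas[t] / np.sqrt(1.0 - abar[t]) * eps_hat) / np.sqrt(alphas[t])
    if t > 0:  # no noise is added at the final step
        mean = mean + np.sqrt(betas[t]) * rng.standard_normal(x_t.shape)
    return mean

# Few-shot sampling: a 5-image conditioning set from one class is summarized
# into a context vector that would be fed to the denoising network.
support_set = rng.standard_normal((5, 8, 8))  # 5 toy 8x8 "images"
context = set_context(support_set)            # class summary, shape (16,)

betas = np.linspace(1e-4, 0.02, 10)           # short noise schedule
x = rng.standard_normal((8, 8))               # start from pure noise
for t in reversed(range(10)):
    # Placeholder predictor: a real FSDM conditions eps on (x, t, context).
    eps_hat = np.tanh(x + context.mean())
    x = ddpm_step(x, t, eps_hat, betas)
```

In the actual model the context is not a single pooled vector fed through a toy predictor; the set-based ViT produces learned conditioning signals that modulate the denoising U-Net at every timestep.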