Denoising diffusion probabilistic models (DDPMs) are a recent family of generative models that achieve state-of-the-art results. To obtain class-conditional generation, prior work proposed guiding the diffusion process with gradients from a time-dependent classifier. While the idea is theoretically sound, deep learning-based classifiers are notoriously susceptible to gradient-based adversarial attacks. Therefore, although traditional classifiers may achieve good accuracy, their gradients may be unreliable and can hinder improvements in generation quality. Recent work found that adversarially robust classifiers exhibit gradients aligned with human perception, and that these can better guide a generative process towards semantically meaningful images. We build on this observation by defining and training a time-dependent adversarially robust classifier and using it to guide a generative diffusion model. In experiments on the highly challenging and diverse ImageNet dataset, our scheme yields significantly more intelligible intermediate gradients, better alignment with theoretical findings, and improved generation results under several evaluation metrics. Furthermore, we conduct an opinion survey whose findings indicate that human raters prefer our method's results.
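To make the mechanism concrete, the following is a minimal sketch of classifier guidance on a toy problem, not the implementation used in this work. It assumes a hypothetical linear softmax classifier whose logits are scaled by (1 - t) to mimic time dependence, and it uses the standard guidance rule that shifts the reverse-step mean by the scaled gradient of log p(y | x_t, t). The helper `sign_attack` illustrates, in a single-step form, the kind of gradient-based perturbation that adversarial training would present to the classifier; the names `weights`, `guidance_scale`, and `eps` are illustrative assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classifier_logits(x, t, weights):
    """Hypothetical time-dependent linear classifier.

    Assumption: t lies in [0, 1] and the logits are scaled by (1 - t),
    so the classifier becomes less confident on noisier inputs.
    """
    scale = 1.0 - t
    return [scale * sum(w_d * x_d for w_d, x_d in zip(w, x)) for w in weights]


def grad_log_prob(x, t, y, weights):
    """Analytic gradient of log p(y | x, t) with respect to x."""
    probs = softmax(classifier_logits(x, t, weights))
    scale = 1.0 - t
    return [
        scale * (weights[y][d] - sum(p * w[d] for p, w in zip(probs, weights)))
        for d in range(len(x))
    ]


def guided_mean(mu, sigma2, x, t, y, weights, guidance_scale=1.0):
    """Shift the reverse-step mean along the classifier gradient.

    Implements mu + s * sigma^2 * grad_x log p(y | x_t, t) for an
    isotropic variance sigma2 (a simplifying assumption).
    """
    g = grad_log_prob(x, t, y, weights)
    return [m + guidance_scale * sigma2 * g_d for m, g_d in zip(mu, g)]


def sign_attack(x, t, y, weights, eps):
    """Single-step sign-gradient perturbation that lowers log p(y | x, t).

    A robust classifier would be trained on inputs perturbed like this;
    the single-step form is an illustrative simplification.
    """
    g = grad_log_prob(x, t, y, weights)
    step = [0.0 if g_d == 0 else math.copysign(1.0, g_d) for g_d in g]
    return [x_d - eps * s for x_d, s in zip(x, step)]


# Two classes in two dimensions; class 0 prefers the first coordinate.
weights = [[1.0, 0.0], [0.0, 1.0]]
x_t = [0.0, 0.0]
shifted = guided_mean([0.0, 0.0], 0.1, x_t, 0.5, 0, weights, guidance_scale=2.0)
x_adv = sign_attack(x_t, 0.5, 0, weights, eps=0.1)
```

In this toy setting the guided mean moves toward the region the classifier associates with class 0, while the perturbed input moves away from it. With a real neural classifier the gradient would come from automatic differentiation, and the claim of the abstract is that a robust classifier makes that gradient more perceptually meaningful.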