Interest in image synthesis within the machine learning community has grown significantly in recent years, with the introduction of a wide range of deep generative models and methods for training them. In this work, we propose a general, model-agnostic technique for improving the image quality and distribution fidelity of images produced by any generative model. Our method, termed BIGRoC (Boosting Image Generation via a Robust Classifier), is a post-processing procedure guided by a given robust classifier, and it requires no additional training of the generative model. Given a synthesized image, we propose to update it through projected gradient steps over the robust classifier to refine its recognition. We demonstrate this post-processing algorithm on various image synthesis methods and show significant quantitative and qualitative improvements on CIFAR-10 and ImageNet. Surprisingly, although BIGRoC is the first model-agnostic refinement approach and requires much less information, it outperforms competing methods. Specifically, BIGRoC improves the best-performing diffusion model for image synthesis on ImageNet 128x128 by 14.81%, attaining an FID score of 2.53, and on 256x256 by 7.87%, achieving an FID of 3.63. Moreover, we conduct an opinion survey, according to which humans significantly prefer our method's outputs.
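To make the idea of "projected gradient steps over the robust classifier" concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy linear softmax classifier (`W`, `b`) in place of an adversarially robust deep network, a known target label, and an L2 ball of radius `eps` around the synthesized input. The function name `refine` and the values of `eps`, `step`, and `n_steps` are hypothetical choices made for illustration only.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D vector of logits."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


def target_prob(x, W, b, target):
    """Probability the toy classifier assigns to `target` for input x."""
    return softmax(W @ x + b)[target]


def refine(x0, W, b, target, eps=0.5, step=0.1, n_steps=10):
    """Projected gradient ascent on log p(target | x).

    Assumption: the classifier is linear-softmax, so the gradient is
    available in closed form. The iterate is kept inside an L2 ball of
    radius `eps` centred on the original sample x0.
    """
    x = x0.astype(float).copy()
    onehot = np.zeros(W.shape[0])
    onehot[target] = 1.0
    for _ in range(n_steps):
        p = softmax(W @ x + b)
        # Gradient of log p[target] with respect to x.
        grad = W.T @ (onehot - p)
        norm = np.linalg.norm(grad)
        if norm == 0.0:
            break
        # Normalised ascent step.
        x = x + step * grad / norm
        # Project back onto the L2 ball around x0.
        delta = x - x0
        dist = np.linalg.norm(delta)
        if dist > eps:
            x = x0 + delta * (eps / dist)
    return x


# Toy stand-ins: a 3-class classifier over 2-D "images".
W = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
b = np.zeros(3)
x0 = np.array([0.2, 0.1])

x_refined = refine(x0, W, b, target=1)
print("p(target) before:", target_prob(x0, W, b, 1))
print("p(target) after: ", target_prob(x_refined, W, b, 1))
print("distance moved:  ", np.linalg.norm(x_refined - x0))
```

In this sketch the refined sample stays within `eps` of the original while the classifier's confidence in the target class increases. With a real robust classifier the closed-form gradient would be replaced by automatic differentiation, and the radius and step size would need tuning per dataset.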