With recent progress in deep generative models, identifying synthetic data and comparing the generative processes that produced them has become an important task, with applications that include fighting visual misinformation and source attribution. Existing methods often approximate the distance between models via their sample distributions. In this paper, we approach the problem of fingerprinting generative models by learning representations that encode the residual artifacts left by the generative models as unique signals that identify the source models. We treat these unique traces (a.k.a. "artificial fingerprints") as representations of generative models, and demonstrate their usefulness both in the discriminative task of source attribution and in the unsupervised task of defining a similarity between the underlying models. We first extend existing studies on GAN fingerprints to four representative classes of generative models (VAEs, Flows, GANs and score-based models), and demonstrate that such fingerprints exist and are attributable. We then improve the stability and attributability of the fingerprints by proposing a new learning method based on set-encoding and contrastive training. Unlike existing methods, which operate on individual images, our set-encoder learns fingerprints from a set of images. We demonstrate the improvements in stability and attributability through comparisons to state-of-the-art fingerprint methods and through ablation studies. Further, our method employs contrastive training to learn an implicit similarity between models. Using this similarity as the metric in a standard hierarchical clustering algorithm, we discover latent families of generative models.
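To make the set-level idea concrete, the following is a minimal toy sketch, not the method proposed in the paper. It swaps the learned set-encoder and contrastive training for fixed mean pooling over synthetic per-image feature vectors, and it uses a small hand-written average-linkage routine as the "standard hierarchical clustering algorithm". The model names, the artifact vectors in `ARTIFACTS`, the noise level, and all function names are hypothetical and chosen only for illustration.

```python
import math
import random

# Hypothetical artifact directions for four made-up models in two families.
ARTIFACTS = {
    "gan_a": [1.0, 0.2, 0.0, 0.0],
    "gan_b": [1.0, -0.2, 0.0, 0.0],
    "flow_a": [0.0, 0.0, 1.0, 0.2],
    "flow_b": [0.0, 0.0, 1.0, -0.2],
}

def sample_features(artifact, n_images, rng):
    """Toy per-image features: the model's artifact plus unit Gaussian noise."""
    return [[a + rng.gauss(0.0, 1.0) for a in artifact] for _ in range(n_images)]

def set_fingerprint(features):
    """Mean-pool a set of feature vectors into one unit-norm vector.

    Mean pooling is order-invariant, so the result depends on the set,
    not on the order in which the images are presented.
    """
    dim = len(features[0])
    mean = [sum(f[i] for f in features) / len(features) for i in range(dim)]
    norm = math.sqrt(sum(x * x for x in mean)) or 1.0
    return [x / norm for x in mean]

def cosine_distance(a, b):
    """Cosine distance between two unit-norm vectors."""
    return 1.0 - sum(x * y for x, y in zip(a, b))

def average_linkage(points, n_clusters):
    """Agglomerative clustering: repeatedly merge the two clusters with
    the smallest average pairwise distance until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(
                    cosine_distance(points[a], points[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

rng = random.Random(0)
names = list(ARTIFACTS)
# One fingerprint per model, each pooled from a set of 256 toy images.
fingerprints = [
    set_fingerprint(sample_features(ARTIFACTS[name], 256, rng)) for name in names
]
families = average_linkage(fingerprints, 2)
print([[names[i] for i in cluster] for cluster in families])
```

In this toy setting, pooling over a set averages out the per-image noise, which is one intuition for why a set-level fingerprint can be more stable than one computed from a single image. A learned encoder trained contrastively would replace both `set_fingerprint` and the fixed cosine distance.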