Over the past six years, deep generative models have achieved a qualitatively new level of performance. Generated data has become difficult, if not impossible, to distinguish from real data. While plenty of use cases benefit from this technology, there are also strong concerns about how it can be misused to spoof sensors, generate deep fakes, and enable misinformation at scale. Unfortunately, current deep fake detection methods are not sustainable, as the gap between real and fake continues to close. In contrast, our work enables responsible disclosure of such state-of-the-art generative models by allowing researchers and companies to fingerprint their models, so that generated samples containing a fingerprint can be accurately detected and attributed to a source. Our technique achieves this through efficient and scalable ad-hoc generation of a large population of models with distinct fingerprints. Our recommended operation point uses a 128-bit fingerprint, which in principle results in more than $10^{36}$ identifiable models. Experiments show that our method fulfills key properties of a fingerprinting mechanism and is effective for deep fake detection and attribution.
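As a quick sanity check on the capacity figure, the following arithmetic assumes that all 128 bits are free and that each distinct bit string identifies one model (an assumption made here for illustration; the abstract does not state how its bound is derived):

$$2^{128} = \left(2^{10}\right)^{12} \cdot 2^{8} = 1024^{12} \cdot 256 \approx 3.4 \times 10^{38} > 10^{36}$$

Under that assumption, the stated bound of more than $10^{36}$ models is conservative by roughly two orders of magnitude.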