PTW: Pivotal Tuning Watermarking for Pre-Trained Image Generators

Deepfakes are content synthesized using deep generators, which, when misused, have the potential to erode trust in digital media. Synthesizing high-quality deepfakes requires access to large and complex generators that only a few entities can train and provide. The threat is malicious users who exploit access to a provided model to generate harmful deepfakes without risking detection. Watermarking makes deepfakes detectable by embedding an identifiable code into the generator that is later extractable from its generated images. We propose Pivotal Tuning Watermarking (PTW), a method for watermarking pre-trained generators (i) three orders of magnitude faster than watermarking from scratch and (ii) without the need for any training data. We improve existing watermarking methods and scale to generators $4 \times$ larger than related work. PTW can embed longer codes than existing methods while better preserving the generator's image quality. We propose rigorous, game-based definitions for robustness and undetectability, and our study reveals that watermarking is not robust against an adaptive white-box attacker who has control over the generator's parameters. We propose an adaptive attack that can successfully remove any watermarking with access to only $200$ non-watermarked images. Our work challenges the trustworthiness of watermarking for deepfake detection when the parameters of a generator are available.
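
As a rough illustration of what "an identifiable code that is later extractable" can mean in practice, the sketch below checks whether a bit string extracted from an image matches a secret key more often than chance would explain, using a binomial tail probability. This is a generic verification scheme and not necessarily the one used in the paper. The function names, the 40-bit key, and the $10^{-6}$ threshold are assumptions chosen for illustration, and the extraction step itself (for example, a decoder network applied to an image) is not shown.

```python
import math

def match_p_value(extracted, key):
    """Probability that a uniformly random bit string agrees with `key`
    in at least as many positions as `extracted` does."""
    if len(extracted) != len(key):
        raise ValueError("extracted message and key must have equal length")
    n = len(key)
    # Number of positions where the extracted bit equals the key bit.
    k = sum(1 for a, b in zip(extracted, key) if a == b)
    # Upper tail of Binomial(n, 1/2): P[matches >= k].
    return sum(math.comb(n, i) for i in range(k, n + 1)) / 2 ** n

def is_watermarked(extracted, key, alpha=1e-6):
    """Flag the image as watermarked when a chance match is very unlikely.
    `alpha` is a hypothetical false-positive budget, not a value from the paper."""
    return match_p_value(extracted, key) < alpha

# Hypothetical 40-bit key; a real key would be sampled at random and kept secret.
key = [1, 0] * 20
noisy = key[:37] + [1 - b for b in key[37:]]  # 3 of 40 bits flipped
print(is_watermarked(key, key))    # perfect match
print(is_watermarked(noisy, key))  # a few bit errors still pass the test
```

Longer keys leave more room for bit errors at the same false-positive rate, which is one reason the capacity of a watermark (how many bits it can embed) matters.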


