As neural networks become able to generate realistic artificial images, they have the potential to improve movies, music, and video games, and to make the internet an even more creative and inspiring place. At the same time, the latest technology potentially enables new digital ways to lie. In response, a diverse and reliable toolbox of methods is needed to identify artificial images and other content. Previous work primarily relies on pixel-space CNNs or the Fourier transform. To the best of our knowledge, methods for analyzing and detecting synthesized fake images based on a multi-scale wavelet representation, which is localized in both space and frequency, have been absent thus far. This paper proposes to learn a model for the detection of synthetic images based on the wavelet-packet representation of natural and GAN-generated images. We evaluate our method on FFHQ, CelebA, and LSUN source identification problems and find improved or competitive performance. Our forensic classifier has a small network size and can be trained efficiently. Furthermore, comparing the wavelet coefficients of the two image sources allows an interpretation and reveals significant differences between them.
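To make the wavelet-packet representation concrete, the following is a minimal NumPy sketch of a 2D wavelet-packet decomposition. It assumes the Haar wavelet and a single-channel image whose side lengths are divisible by 2^levels; the abstract does not name a wavelet, and the function names `haar_step` and `haar_packets` are hypothetical, chosen here for illustration only. Unlike the ordinary wavelet transform, which refines only the low-pass band, the packet transform splits every subband again at each level, so `levels` steps yield 4^levels subbands that are localized in both space and frequency.

```python
import numpy as np


def haar_step(x):
    """One 2D Haar analysis step: split x into four half-resolution subbands."""
    s = np.sqrt(2.0)
    # Filter and downsample along the rows (low-pass = sum, high-pass = difference).
    low_rows = (x[0::2, :] + x[1::2, :]) / s
    high_rows = (x[0::2, :] - x[1::2, :]) / s
    subbands = []
    for band in (low_rows, high_rows):
        # Filter and downsample along the columns.
        subbands.append((band[:, 0::2] + band[:, 1::2]) / s)
        subbands.append((band[:, 0::2] - band[:, 1::2]) / s)
    return subbands


def haar_packets(image, levels):
    """Full wavelet-packet tree: every subband is split again at each level.

    Returns an array of shape (4**levels, H / 2**levels, W / 2**levels).
    Assumes H and W are divisible by 2**levels (an assumption of this sketch).
    """
    bands = [np.asarray(image, dtype=np.float64)]
    for _ in range(levels):
        bands = [sub for band in bands for sub in haar_step(band)]
    return np.stack(bands)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.standard_normal((64, 64))  # stand-in for a grayscale image
    packets = haar_packets(image, levels=3)
    print(packets.shape)
```

The stacked subbands could then serve as input channels to a small classifier, or be averaged over many natural and generated images and compared band by band. Both uses are illustrations under the assumptions above, not a description of the paper's exact pipeline.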