Recent technological advances in synthetic data have enabled the generation of images of such high quality that human beings cannot tell the difference between real-life photographs and Artificial Intelligence (AI) generated images. Given the critical need for data reliability and authentication, this article proposes to enhance our ability to recognise AI-generated images through computer vision. Initially, a synthetic dataset is generated with latent diffusion that mirrors the ten classes of the already available CIFAR-10 dataset, providing a contrasting set of images for comparison against real photographs. The generative model is capable of producing complex visual attributes, such as photorealistic reflections in water. The two sets of data present a binary classification problem: whether a photograph is real or generated by AI. This study then proposes the use of a Convolutional Neural Network (CNN) to classify the images into two categories: real or fake. Following hyperparameter tuning and the training of 36 individual network topologies, the optimal approach correctly classified the images with 92.98% accuracy. Finally, this study implements explainable AI via Gradient Class Activation Mapping (Grad-CAM) to explore which features within the images are useful for classification. Interpretation reveals interesting insights; in particular, the depicted entity itself does not hold useful information for classification. Instead, the model focuses on small visual imperfections in the background of the images. The complete dataset engineered for this study, referred to as the CIFAKE dataset, is made publicly available to the research community for future work.
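The binary real-versus-fake classification task described above can be sketched with a small CNN. This is a minimal illustration only, assuming 32×32 RGB inputs matching CIFAR-10; the layer sizes and the `RealFakeCNN` name are illustrative choices, not the specific topology selected by the study's hyperparameter search.

```python
import torch
import torch.nn as nn

class RealFakeCNN(nn.Module):
    """Minimal CNN sketch for binary real/fake image classification.

    Assumes CIFAR-10-sized inputs (3 x 32 x 32); the exact topology used in
    the study was found via hyperparameter tuning and may differ.
    """

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),   # 32 x 32 x 32
            nn.ReLU(),
            nn.MaxPool2d(2),                              # 32 x 16 x 16
            nn.Conv2d(32, 64, kernel_size=3, padding=1),  # 64 x 16 x 16
            nn.ReLU(),
            nn.MaxPool2d(2),                              # 64 x 8 x 8
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 8 * 8, 64),
            nn.ReLU(),
            nn.Linear(64, 1),  # single logit: >0 suggests "real", <0 "fake"
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = RealFakeCNN()
logits = model(torch.zeros(4, 3, 32, 32))  # a dummy batch of 4 images
print(logits.shape)
```

In practice the single output logit would be passed through a sigmoid and trained with `nn.BCEWithLogitsLoss` on batches drawn from the real (CIFAR-10) and synthetic (CIFAKE) images; Grad-CAM can then be applied to the final convolutional layer to visualise which image regions drive the prediction.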