The ability to perform effective forensic analysis on printed and scanned (PS) images is essential in many applications. Printing and scanning can conceal the artifacts that reveal an image's synthetic nature: such artifacts are typically present in manipulated images, and the main artifacts of synthetic images can be removed by the PS process. Generative Adversarial Networks (GANs) have become popular, and the synthetic face images they generate are difficult to distinguish from genuine human faces and may be used to create counterfeit identities. Additionally, since GAN models do not account for physiological constraints when generating human faces, or for the impact of those constraints on human irises, distinguishing genuine from synthetic irises in the PS scenario becomes extremely difficult. Because large-scale reference iris datasets are lacking for the PS scenario, we aim to develop a novel dataset, available at [45], to serve as a standard for Multimedia Forensics (MFs) investigation. In this paper, we provide a novel dataset made up of a large number of synthetic and natural printed irises taken from VIPPrint printed and scanned face images. We extracted the irises from face images; because of eyelid occlusion, the extraction model may have captured incomplete irises. To fill the missing pixels of an extracted iris, we applied techniques that discover the complex link between the iris images. To highlight the problems involved in evaluating the dataset's iris images, we conducted a large number of analyses employing Siamese Neural Networks based on architectures such as ResNet50, Xception, VGG16, and MobileNet-v2 to assess the similarities between genuine and synthetic human irises. For instance, using the Xception network, we achieved 56.76% iris similarity for synthetic images and 92.77% iris similarity for real images.
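The Siamese comparison described above can be illustrated with a minimal sketch. It assumes that both iris images pass through one shared encoder and that the two embeddings are compared with cosine similarity; the abstract does not state the similarity measure, so that choice is an assumption. The encoder here is a toy random linear projection standing in for a backbone such as Xception, and the names `make_shared_encoder`, `cosine_similarity`, and `siamese_similarity` are hypothetical.

```python
import math
import random


def make_shared_encoder(input_dim, embed_dim, seed=0):
    """Build a toy encoder: a fixed random linear projection.

    Assumption: this stands in for a CNN backbone (e.g. Xception);
    it is not the model used in the paper.
    """
    rng = random.Random(seed)
    weights = [
        [rng.gauss(0.0, 1.0) for _ in range(input_dim)]
        for _ in range(embed_dim)
    ]

    def encode(pixels):
        # One output value per weight row (dot product with the input).
        return [sum(w * p for w, p in zip(row, pixels)) for row in weights]

    return encode


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Undefined for zero vectors; return 0 by convention.
    return dot / (norm_a * norm_b)


def siamese_similarity(encode, image_a, image_b):
    """Embed both inputs with the SAME encoder, then compare the embeddings."""
    return cosine_similarity(encode(image_a), encode(image_b))


# Flattened toy "iris images" (hypothetical pixel values in [0, 1]).
encoder = make_shared_encoder(input_dim=4, embed_dim=8)
iris_a = [0.1, 0.5, 0.9, 0.3]
iris_b = [0.2, 0.4, 0.8, 0.1]
score = siamese_similarity(encoder, iris_a, iris_b)
```

The defining property of the Siamese setup is weight sharing: both branches use the same `encode` function, so the score depends only on how the two inputs differ. This sketch does not reproduce the reported percentages, which come from the trained networks evaluated in the paper.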