Because manipulating images by copy-move, splicing and/or inpainting can lead to misinterpretation of the visual content, detecting such manipulations is crucial for media forensics. Given the variety of possible attacks on the content, devising a generic method is nontrivial. Current deep-learning-based methods are promising when training and test data are well aligned, but perform poorly on independent tests. Moreover, owing to the absence of authentic test images, their image-level detection specificity is in doubt. The key question is how to design and train a deep neural network that learns generalizable features sensitive to manipulations in novel data, yet specific enough to prevent false alarms on authentic images. We propose multi-view feature learning to jointly exploit tampering boundary artifacts and the noise view of the input image. As both clues are meant to be semantic-agnostic, the learned features are generalizable. To learn effectively from authentic images, we train with multi-scale (pixel / edge / image) supervision. We term the new network MVSS-Net and its enhanced version MVSS-Net++. Experiments in both within-dataset and cross-dataset scenarios show that MVSS-Net++ performs best and is more robust to JPEG compression, Gaussian blur and screenshot-based image re-capturing.
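To make the idea of multi-scale (pixel / edge / image) supervision concrete, the following is a minimal sketch in plain Python, not the authors' implementation. Several choices here are assumptions that the abstract does not state: the use of a Dice loss for the pixel and edge terms, binary cross-entropy for the image term, global max pooling to turn the pixel map into an image-level score, the weights `(0.5, 0.3, 0.2)`, and skipping the pixel and edge terms for authentic images. The function names are hypothetical.

```python
import math


def _flatten(mask):
    """Turn a 2-D list (rows of floats) into a flat list."""
    return [v for row in mask for v in row]


def dice_loss(pred, target, eps=1e-6):
    """Dice loss between two flat lists of values in [0, 1]."""
    inter = sum(p * t for p, t in zip(pred, target))
    denom = sum(p * p for p in pred) + sum(t * t for t in target)
    return 1.0 - (2.0 * inter + eps) / (denom + eps)


def bce_loss(prob, label, eps=1e-7):
    """Binary cross-entropy for a single probability and a 0/1 label."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1.0 - label) * math.log(1.0 - prob))


def multi_scale_loss(pixel_pred, pixel_gt, edge_pred, edge_gt,
                     weights=(0.5, 0.3, 0.2)):
    """Weighted sum of pixel-, edge- and image-level losses.

    The weights are illustrative assumptions, not values from the paper.
    """
    w_pixel, w_edge, w_image = weights
    p_pred, p_gt = _flatten(pixel_pred), _flatten(pixel_gt)
    e_pred, e_gt = _flatten(edge_pred), _flatten(edge_gt)

    # Image-level score: global max pooling over the pixel map (assumption).
    image_prob = max(p_pred)
    is_manipulated = any(v > 0 for v in p_gt)
    l_image = bce_loss(image_prob, 1.0 if is_manipulated else 0.0)

    # Assumption: an authentic image has empty masks, so only the
    # image-level term supervises it.
    if is_manipulated:
        l_pixel = dice_loss(p_pred, p_gt)
        l_edge = dice_loss(e_pred, e_gt)
    else:
        l_pixel = l_edge = 0.0

    return w_pixel * l_pixel + w_edge * l_edge + w_image * l_image


if __name__ == "__main__":
    empty = [[0.0, 0.0], [0.0, 0.0]]
    quiet = [[0.1, 0.1], [0.1, 0.1]]        # low response everywhere
    false_alarm = [[0.9, 0.1], [0.1, 0.1]]  # one confident false positive
    print("authentic, quiet map:", multi_scale_loss(quiet, empty, quiet, empty))
    print("authentic, false alarm:", multi_scale_loss(false_alarm, empty, quiet, empty))
```

In this sketch an authentic image still produces a training signal through the image-level term: a single confident false positive in the pixel map raises the loss, which is one way a network could be pushed toward higher specificity.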