Deepfakes are computer-manipulated videos in which the face of one individual has been replaced with that of another. Software for creating such forgeries is easy to use and increasingly popular, posing serious threats to personal reputation and public security. The quality of classifiers for detecting deepfakes has improved with the release of ever-larger datasets, but our understanding of why a particular video has been labelled as fake has not kept pace. In this work we develop, extend and compare white-box, black-box and model-specific techniques for explaining the labelling of real and fake videos. In particular, we adapt SHAP, GradCAM and self-attention models to the task of explaining the predictions of state-of-the-art detectors based on EfficientNet, trained on the Deepfake Detection Challenge (DFDC) dataset. We compare the resulting explanations, propose metrics to quantify their visual features and desirable properties, and conduct a user survey to collect opinions on the usefulness of the explainers.
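To make the GradCAM adaptation concrete, the sketch below shows a minimal Grad-CAM pass over a single frame with a torchvision EfficientNet-B0 backbone. This is an illustrative assumption, not the paper's exact pipeline: the binary real/fake head, the choice of `model.features[-1]` as the target layer, and the 224x224 input are placeholders for whatever the trained DFDC detector actually uses.

```python
# Minimal Grad-CAM sketch for an EfficientNet-based frame classifier.
# The binary real/fake head and the hooked layer are assumptions for
# illustration; they are not taken from the paper.
import torch
import torch.nn.functional as F
from torchvision import models

model = models.efficientnet_b0(weights=None)
# Replace the ImageNet head with a 2-way real/fake head (assumed setup).
model.classifier[1] = torch.nn.Linear(model.classifier[1].in_features, 2)
model.eval()

activations, gradients = {}, {}

def fwd_hook(_, __, output):
    activations["value"] = output

def bwd_hook(_, __, grad_output):
    gradients["value"] = grad_output[0]

# Hook the last convolutional stage of the EfficientNet backbone.
target_layer = model.features[-1]
target_layer.register_forward_hook(fwd_hook)
target_layer.register_full_backward_hook(bwd_hook)

def grad_cam(frame: torch.Tensor, class_idx: int = 1) -> torch.Tensor:
    """Return an (H, W) heatmap of regions driving the 'fake' logit."""
    logits = model(frame.unsqueeze(0))               # (1, 2)
    model.zero_grad()
    logits[0, class_idx].backward()
    acts = activations["value"]                      # (1, C, h, w)
    grads = gradients["value"]                       # (1, C, h, w)
    weights = grads.mean(dim=(2, 3), keepdim=True)   # channel importances
    cam = F.relu((weights * acts).sum(dim=1))        # (1, h, w)
    cam = F.interpolate(cam.unsqueeze(1), size=frame.shape[-2:],
                        mode="bilinear", align_corners=False)[0, 0]
    return cam / (cam.max() + 1e-8)

heatmap = grad_cam(torch.randn(3, 224, 224))         # dummy frame for illustration
```

In practice the resulting heatmap would be overlaid on the detected face crop, which is the kind of visual explanation the metrics and user survey described above are meant to evaluate.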