Detecting deepfakes remains an open problem. Current detection methods fail against an adversary who adds imperceptible adversarial perturbations to a deepfake to evade detection. We propose Disjoint Deepfake Detection (D3), a deepfake detector designed to improve adversarial robustness beyond de facto solutions such as adversarial training. D3 uses an ensemble of models, each operating on a disjoint subset of the frequency spectrum, to significantly improve robustness. Our key insight is to leverage redundancy in the frequency domain and apply a saliency partitioning technique that distributes frequency components disjointly across multiple models. We formally prove that these disjoint ensembles reduce the dimensionality of the input subspace in which adversarial deepfakes lie. We then empirically validate D3 against white-box and black-box attacks and find that it significantly outperforms existing state-of-the-art defenses applied to deepfake detection.
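The idea of distributing frequency components disjointly across ensemble members can be illustrated with a minimal sketch. It is not the D3 implementation: the use of a 2D DCT, the round-robin assignment over saliency rank, and the names `dct_matrix`, `disjoint_masks`, and `split_by_frequency` are all assumptions made for illustration, and the saliency scores here are a stand-in (coefficient magnitude) rather than a model-derived saliency.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis as an (n, n) matrix."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

def disjoint_masks(saliency, num_models):
    """Assign each frequency component to exactly one model.

    Assumption: components are dealt out round-robin in order of
    descending saliency, so every model gets a similar share of
    salient frequencies. The actual D3 partitioning may differ.
    """
    order = np.argsort(-saliency.ravel(), kind="stable")
    masks = np.zeros((num_models, saliency.size), dtype=bool)
    for rank, idx in enumerate(order):
        masks[rank % num_models, idx] = True
    return masks.reshape((num_models,) + saliency.shape)

def split_by_frequency(image, masks):
    """Return one spatial-domain view per model, each built from
    that model's disjoint subset of DCT coefficients."""
    d = dct_matrix(image.shape[0])
    coeffs = d @ image @ d.T                      # forward 2D DCT
    return [d.T @ (coeffs * m) @ d for m in masks]  # masked inverse DCT

# Toy usage on a random square "image" (hypothetical data).
rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))
d = dct_matrix(8)
saliency = np.abs(d @ image @ d.T)   # stand-in saliency score
masks = disjoint_masks(saliency, num_models=4)
views = split_by_frequency(image, masks)
```

Each entry of `views` would be fed to its own detector. Because the masks form a partition, the views sum back to the original image, while a perturbation confined to one model's frequencies leaves the inputs of the other models unchanged.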