The rapid advancement of speech synthesis and voice conversion technologies has raised significant security concerns in multimedia forensics. Although current detection models demonstrate impressive performance, they struggle to remain effective against constantly evolving deepfake attacks. Moreover, continually fine-tuning these models on historical training data incurs substantial computational and storage costs. To address these limitations, we propose a novel framework that incorporates a Universal Adversarial Perturbation (UAP) into audio deepfake detection, enabling models to retain knowledge of historical spoofing distributions without direct access to past data. Our method integrates the UAP seamlessly with pre-trained self-supervised audio models during fine-tuning. Extensive experiments validate the effectiveness of our approach, showcasing its potential as an efficient solution for continual learning in audio deepfake detection.
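To make the core operation concrete, the following is a minimal sketch of how a UAP might be applied to input audio before it is fed to the detector. The abstract does not specify implementation details, so everything here is an assumption: the function name `apply_uap`, the additive waveform-level perturbation, and the L-infinity projection with radius `epsilon` are all illustrative choices, not the paper's actual method.

```python
import numpy as np

def apply_uap(batch, uap, epsilon=0.01):
    """Add a universal (input-agnostic) perturbation to every waveform in a batch.

    batch:   (B, T) array of waveforms with samples in [-1, 1]
    uap:     (T,) learned perturbation shared across all inputs
    epsilon: L-infinity budget keeping the perturbation small (assumed hyperparameter)
    """
    # Project the shared perturbation onto the epsilon ball so it stays bounded.
    uap = np.clip(uap, -epsilon, epsilon)
    # Broadcast the same perturbation across the batch and keep samples in range.
    return np.clip(batch + uap, -1.0, 1.0)

# Toy usage with random data standing in for real audio.
rng = np.random.default_rng(0)
batch = rng.uniform(-0.5, 0.5, size=(4, 16000))   # 4 one-second clips at 16 kHz
uap = rng.normal(0.0, 0.05, size=16000)           # unprojected candidate perturbation
perturbed = apply_uap(batch, uap)
```

In a continual-learning setting such as the one the abstract describes, a perturbation like this could be optimized on an earlier spoofing distribution and later replayed in place of the raw historical data, which is what removes the need to store past training sets.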