There has been a recent surge in adversarial attacks on deep learning-based automatic speech recognition (ASR) systems. These attacks pose new challenges to deep learning security and have raised significant concerns about deploying ASR systems in safety-critical applications. In this work, we introduce WaveGuard: a framework for detecting adversarial inputs that are crafted to attack ASR systems. Our framework incorporates audio transformation functions and analyzes the ASR transcriptions of the original and transformed audio to detect adversarial inputs. We demonstrate that our defense framework is able to reliably detect adversarial examples constructed by four recent audio adversarial attacks, using a variety of audio transformation functions. With careful regard for best practices in defense evaluations, we analyze our proposed defense and its ability to withstand adaptive and robust attacks in the audio domain. We empirically demonstrate that audio transformations that recover audio from perceptually informed representations can lead to a strong defense that is robust against an adaptive adversary even in a complete white-box setting. Furthermore, WaveGuard can be used out of the box and integrated directly with any ASR model to efficiently detect audio adversarial examples, without the need for model retraining.
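To make the detection pipeline concrete, the following is a minimal sketch of the comparison step described above: transcribe the original and transformed audio, then flag the input if the two transcriptions diverge. The specific transformation shown (8-bit quantization-dequantization), the character-error-rate threshold of 0.5, and the `asr_transcribe` callable are illustrative assumptions, not the paper's exact configuration; the framework evaluates several transformation functions and works with any ASR model's inference call.

```python
import numpy as np
from typing import Callable

def quantize_dequantize(audio: np.ndarray, bits: int = 8) -> np.ndarray:
    """Illustrative transformation: quantize waveform samples to `bits`
    bits and reconstruct, discarding small adversarial perturbations.
    Assumes samples lie in [-1.0, 1.0]."""
    levels = 2 ** bits
    q = np.round((audio + 1.0) / 2.0 * (levels - 1))
    return q / (levels - 1) * 2.0 - 1.0

def character_error_rate(ref: str, hyp: str) -> float:
    """Levenshtein edit distance between two transcriptions,
    normalized by the reference length."""
    m, n = len(ref), len(hyp)
    d = np.zeros((m + 1, n + 1), dtype=np.int32)
    d[:, 0] = np.arange(m + 1)
    d[0, :] = np.arange(n + 1)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i, j] = min(d[i - 1, j] + 1,
                          d[i, j - 1] + 1,
                          d[i - 1, j - 1] + cost)
    return d[m, n] / max(m, 1)

def is_adversarial(audio: np.ndarray,
                   asr_transcribe: Callable[[np.ndarray], str],
                   threshold: float = 0.5) -> bool:
    """Flag the input as adversarial if the transcriptions of the
    original and transformed audio diverge beyond the threshold.
    `asr_transcribe` is a hypothetical stand-in for any ASR model."""
    original_text = asr_transcribe(audio)
    transformed_text = asr_transcribe(quantize_dequantize(audio))
    return character_error_rate(original_text, transformed_text) > threshold
```

The intuition behind this design is that a benign input transcribes to roughly the same text before and after a perturbation-destroying transformation, whereas an adversarial perturbation is brittle and its targeted transcription collapses once the transformation is applied; because the check only wraps the model's inference call, no retraining is required.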