A targeted adversarial attack produces audio samples that force an Automatic Speech Recognition (ASR) system to output attacker-chosen text. To exploit ASR models in real-world, black-box settings, an adversary can leverage the transferability property, i.e., an adversarial sample crafted against a proxy ASR can also fool a different, remote ASR. However, recent work has shown that achieving transferability against large ASR models is very difficult. In this work, we show that modern ASR architectures, specifically those based on Self-Supervised Learning (SSL), are in fact vulnerable to transferable adversarial attacks. We demonstrate this phenomenon by evaluating state-of-the-art self-supervised ASR models such as Wav2Vec2, HuBERT, Data2Vec, and WavLM. We show that, with low-level additive noise at a 30 dB signal-to-noise ratio, we achieve targeted transferability with up to 80% accuracy. Next, we 1) use an ablation study to show that Self-Supervised Learning is the main cause of this phenomenon, and 2) provide an explanation for it. Through this, we show that modern ASR architectures are uniquely vulnerable to adversarial security threats.
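To make the threat model concrete, the sketch below illustrates the general transfer setup described above: a perturbation is optimised on a proxy SSL-based ASR so that the audio decodes to an attacker-chosen transcription, its energy is projected onto a 30 dB SNR budget, and the resulting sample is then decoded by an independent victim model. This is a minimal, generic illustration (PGD-style optimisation of the CTC loss), not the exact attack evaluated in this work; the HuggingFace checkpoints, hyperparameters, and placeholder waveform are assumptions for demonstration only.

```python
# Minimal sketch, NOT the paper's exact attack: targeted optimisation of the CTC loss
# on a proxy self-supervised ASR (Wav2Vec2), with the perturbation kept within a
# 30 dB SNR budget, then decoded by a separate victim model (HuBERT) to probe transfer.
# Checkpoint names, hyperparameters, and the placeholder audio are illustrative assumptions.
import torch
from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC, HubertForCTC


def snr_db(clean: torch.Tensor, noise: torch.Tensor) -> float:
    """Signal-to-noise ratio in dB between the clean waveform and the additive perturbation."""
    return float(10 * torch.log10(clean.pow(2).sum() / noise.pow(2).sum()))


def targeted_attack(clean: torch.Tensor, target_text: str, processor, proxy,
                    snr_budget_db: float = 30.0, steps: int = 500, lr: float = 1e-3):
    """Optimise an additive perturbation so the proxy ASR outputs `target_text`."""
    labels = processor(text=target_text, return_tensors="pt").input_ids
    delta = torch.zeros_like(clean, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    # Largest perturbation energy allowed by the SNR budget.
    max_energy = clean.pow(2).sum() / (10 ** (snr_budget_db / 10))
    for _ in range(steps):
        # Wav2Vec2ForCTC returns the CTC loss when `labels` are provided.
        loss = proxy(input_values=(clean + delta).unsqueeze(0), labels=labels).loss
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():  # project the noise back onto the SNR ball
            energy = delta.pow(2).sum()
            if energy > max_energy:
                delta.mul_((max_energy / energy).sqrt())
    return (clean + delta).detach()


def transcribe(model, processor, audio: torch.Tensor) -> str:
    """Greedy CTC decoding of a single waveform."""
    logits = model(input_values=audio.unsqueeze(0)).logits
    return processor.batch_decode(logits.argmax(dim=-1))[0]


if __name__ == "__main__":
    processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
    proxy = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h").eval()
    victim = HubertForCTC.from_pretrained("facebook/hubert-large-ls960-ft").eval()

    clean = torch.randn(16_000 * 3) * 0.05  # placeholder: 3 s of 16 kHz "audio"
    adv = targeted_attack(clean, "OPEN THE DOOR", processor, proxy)

    print("SNR:", snr_db(clean, adv - clean), "dB")
    print("Proxy output :", transcribe(proxy, processor, adv))
    print("Victim output:", transcribe(victim, processor, adv))
```

If the victim model transcribes the attacker-chosen text (or something close to it) despite never being used during optimisation, the adversarial sample has transferred, which is the black-box scenario studied in this work.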