A leaderboard named the Speech processing Universal PERformance Benchmark (SUPERB), which aims to benchmark the performance of a shared self-supervised learning (SSL) speech model across various downstream speech tasks with minimal architectural modification and a small amount of data, has fueled research on speech representation learning. SUPERB demonstrates that speech SSL upstream models improve the performance of various downstream tasks with only minimal adaptation. As the paradigm of a self-supervised upstream model followed by downstream tasks attracts increasing attention in the speech community, characterizing the adversarial robustness of this paradigm is a high priority. In this paper, we make the first attempt to investigate the adversarial vulnerability of this paradigm under attacks from both zero-knowledge and limited-knowledge adversaries. The experimental results show that the paradigm proposed by SUPERB is seriously vulnerable to limited-knowledge adversaries, and that the attacks generated by zero-knowledge adversaries exhibit transferability. An XAB test verifies the imperceptibility of the crafted adversarial attacks.
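To make the limited-knowledge (white-box) setting concrete, the sketch below shows the classic one-step FGSM-style perturbation, which bounds the attack by a small L-infinity budget so it stays hard to perceive. This is a minimal illustration on a toy linear stand-in model, not the attack pipeline or SSL models used in the paper; all names (`W`, `fgsm_attack`, `epsilon`) are illustrative assumptions.

```python
import numpy as np

# Toy stand-in for "SSL upstream + downstream head": a linear classifier
# mapping a 16-dim input frame to 2 logits. Purely illustrative.
rng = np.random.default_rng(0)
W = rng.standard_normal((16, 2))

def loss(x, y):
    # cross-entropy of the softmax over logits for true class y
    z = x @ W
    z = z - z.max()                       # numerical stability
    p = np.exp(z) / np.exp(z).sum()
    return -np.log(p[y])

def grad_loss(x, y):
    # analytic gradient of the cross-entropy w.r.t. the input x
    z = x @ W
    z = z - z.max()
    p = np.exp(z) / np.exp(z).sum()
    onehot = np.eye(2)[y]
    return W @ (p - onehot)

def fgsm_attack(x, y, epsilon=0.01):
    # one gradient-sign step, bounded by epsilon in L-infinity norm;
    # a small epsilon keeps the perturbation imperceptible
    return x + epsilon * np.sign(grad_loss(x, y))

x = rng.standard_normal(16)
y = 0
x_adv = fgsm_attack(x, y)
```

In practice, attacks on speech models apply the same idea to waveforms through a differentiable upstream model, often with multiple iterations (PGD) rather than this single step.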