With advances in hardware and algorithms, ASR (Automatic Speech Recognition) systems have evolved considerably. As models become simpler and development and deployment become easier, ASR systems are moving closer to everyday life. On the one hand, we often use ASR apps or APIs to generate subtitles and transcribe meetings; on the other hand, smart speakers and self-driving cars rely on ASR systems to control AIoT devices. In recent years there has been a great deal of work on adversarial-example attacks against ASR systems: by adding a small perturbation to a waveform, an attacker can drastically change the recognition result. In this paper, we describe the development of ASR systems, different attack assumptions, and how to evaluate these attacks. We then survey current work on adversarial-example attacks under two threat models: white-box attacks and black-box attacks. Unlike other surveys, we pay particular attention to which layer of the ASR system each attack perturbs, the relationships between these attacks, and their implementation methods, with a focus on the effectiveness of each attack.
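To make the core idea concrete, the following is a minimal, illustrative sketch of a signed-gradient (FGSM-style) perturbation on a waveform. It uses a toy linear score in place of a real ASR model's loss gradient; all names (`fgsm_perturb`, `w`, `epsilon`) are assumptions for illustration and do not come from any specific attack discussed in this survey.

```python
import numpy as np

def fgsm_perturb(waveform, grad, epsilon):
    """Add a small signed-gradient perturbation and keep samples in a
    valid audio range. `grad` is the gradient of some model loss with
    respect to the input waveform."""
    adv = waveform + epsilon * np.sign(grad)
    return np.clip(adv, -1.0, 1.0)

rng = np.random.default_rng(0)
x = rng.uniform(-0.5, 0.5, size=16000)  # one second of fake 16 kHz audio
w = rng.normal(size=16000)              # toy "model": score(x) = w @ x
grad = w                                # gradient of w @ x w.r.t. x is w

x_adv = fgsm_perturb(x, grad, epsilon=0.002)

# Per-sample change is at most epsilon (imperceptibly small)...
print(np.max(np.abs(x_adv - x)))
# ...yet the toy score shifts by roughly epsilon * sum(|w|),
# which is large relative to typical score variation.
print(w @ x_adv - w @ x)
```

With a real white-box attack, `grad` would come from backpropagating the ASR model's loss (e.g., a CTC loss toward a target transcription) through to the input samples; the perturbation step itself is the same.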