With recent advances in autonomous driving, voice control systems have been increasingly adopted as a means of human-vehicle interaction. This technology enables drivers to control the vehicle with voice commands and will soon be available in Advanced Driver Assistance Systems (ADAS). Prior work has shown that Siri, Alexa, and Cortana are highly vulnerable to inaudible command attacks. Such attacks could extend to ADAS in real-world applications, and the inaudible command threat is difficult to detect because of microphone nonlinearities. In this paper, we aim to develop a more practical solution that uses camera views to defend against inaudible command attacks, exploiting the fact that ADAS can perceive their environment through multiple sensors. To this end, we propose a novel multimodal deep learning classification system to defend against inaudible command attacks. Our experimental results confirm the feasibility of the proposed defense methods, with the best classification accuracy reaching 89.2%. Code is available at https://github.com/ITSEG-MQ/Sensor-Fusion-Against-VoiceCommand-Attacks.