Voice assistants are deployed widely and provide useful functionality. However, recent work has shown that commercial systems like Amazon Alexa and Google Home are vulnerable to voice-based confusion attacks that exploit design issues. We propose a systems-oriented defense against this class of attacks and demonstrate its functionality for Amazon Alexa. Our defense ensures that only the skills a user intends to invoke execute in response to voice commands. Our key insight is that we can infer a user's intentions by analyzing their activity on counterpart systems on the web and on smartphones. For example, the Lyft ride-sharing Alexa skill has a counterpart Android app and a website. Our work shows how information from counterpart apps can help reduce ambiguities in the skill invocation process. We build SkillFence, a browser extension that existing voice assistant users can install to ensure that only legitimate skills run in response to their commands. Using real user data from MTurk (N = 116) and experimental trials involving synthetic and organic speech, we show that SkillFence provides a balance between usability and security by securing 90.83% of the skills that a user will need with a false acceptance rate of 19.83%.