Existing approaches to protecting the privacy of user speech data focus primarily on the server side. While improving server-side privacy reduces certain security concerns, users still have no control over whether privacy is ensured on the client side. In this paper, we define, evaluate, and explore techniques for client-side privacy in speech recognition, where the goal is to preserve the privacy of raw speech data before it leaves the client's device. We first formalize several tradeoffs among performance, compute requirements, and privacy that arise in ensuring client-side privacy. Using our tradeoff analysis, we perform a large-scale empirical study of existing approaches and find that they fall short on at least one metric. Our results call for more research in this crucial area as a step towards safer real-world deployment of speech recognition systems at scale across mobile devices.