Recent advances in machine learning are making Automated Speech Recognition (ASR) more accurate and practical, as evidenced by the rising number of smart devices with voice processing capabilities. More and more devices around us have ASR technology built in. This poses serious privacy threats, because speech contains unique biometric characteristics and personal data. However, the privacy concern can be mitigated if voice features are processed in the encrypted domain. Within this context, this paper proposes an algorithm that redesigns the back-end of a speaker verification system using fully homomorphic encryption. The solution exploits the Cheon-Kim-Kim-Song (CKKS) fully homomorphic encryption scheme to obtain a real-time and non-interactive solution. It includes a novel approach based on the Newton-Raphson method to overcome a limitation of the CKKS scheme, namely that it cannot directly calculate the inverse square root of an encrypted number. This yields an efficient solution with lower multiplicative depth for a negligible loss in accuracy. The proposed algorithm is validated using a well-known speech dataset. It performs encrypted-domain verification in real time (with less than 1.3 seconds of delay) for a 2.8\% equal-error-rate loss compared to plain-domain verification.
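To illustrate why a Newton-Raphson iteration suits this setting, the sketch below approximates the inverse square root using only additions and multiplications, which are the operations CKKS supports natively. It runs on plain floating-point numbers, not ciphertexts, and it is not the paper's algorithm: the function name `inv_sqrt_newton`, the initial guess, and the iteration count are assumptions chosen for illustration, since the abstract does not specify them.

```python
def inv_sqrt_newton(x, y0, iterations):
    """Approximate 1/sqrt(x) with the Newton-Raphson update
    y <- 0.5 * y * (3 - x * y^2), which needs no division or square root.

    Assumption: y0 is a positive initial guess below sqrt(3 / x);
    outside that range the iteration may not converge.
    """
    y = y0
    for _ in range(iterations):
        # Only multiplications and a subtraction appear here, so the same
        # arithmetic could in principle be evaluated on encrypted values.
        y = 0.5 * y * (3.0 - x * y * y)
    return y


if __name__ == "__main__":
    # Plain-domain check: 1/sqrt(4) = 0.5
    print(inv_sqrt_newton(4.0, y0=0.1, iterations=12))
```

Each iteration multiplies the current estimate several times in sequence, so the number of iterations directly drives the multiplicative depth an encrypted evaluation would consume. A good initial guess therefore matters more in the encrypted domain than in the plain domain.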