Besides its linguistic content, our speech is rich in biometric information that can be inferred by classifiers. Learning privacy-preserving representations for speech signals enables downstream tasks without sharing unnecessary, private information about an individual. In this paper, we show that, when generating a non-sensitive representation of speech, protecting gender information is more effective than modelling only speaker-identity information. Our method relies on reconstructing speech by decoding linguistic content along with gender information using a variational autoencoder. Specifically, we exploit disentangled representation learning to encode information about different attributes into separate subspaces that can be factorised independently. We present a novel way to encode gender information and to disentangle two sensitive biometric identifiers, namely gender and identity, in a privacy-protecting setting. Experiments on the LibriSpeech dataset show that gender recognition and speaker verification can be reduced to a random guess, protecting against classification-based attacks, while maintaining the utility of the signal for speech recognition.
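The sketch below illustrates the kind of architecture the abstract describes: a variational autoencoder whose encoder splits speech features into a linguistic-content subspace and a gender subspace, and whose decoder reconstructs speech from the content latent concatenated with a gender code that can be neutralised or resampled at inference time. This is a minimal illustration under assumed design choices, not the authors' implementation; all names, layer sizes, and the input feature dimensionality (e.g. 80-dimensional mel-spectrogram frames) are hypothetical.

```python
# Minimal sketch (not the paper's code) of a VAE that factorises speech
# features into a content subspace and a gender subspace.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DisentangledSpeechVAE(nn.Module):
    def __init__(self, feat_dim=80, content_dim=64, gender_dim=2, hidden=256):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # Separate heads parameterise each latent subspace, so content and
        # gender information can be factorised independently.
        self.content_mu = nn.Linear(hidden, content_dim)
        self.content_logvar = nn.Linear(hidden, content_dim)
        self.gender_mu = nn.Linear(hidden, gender_dim)
        self.gender_logvar = nn.Linear(hidden, gender_dim)
        # The decoder reconstructs speech features from the content latent
        # concatenated with a (possibly neutralised or resampled) gender code.
        self.decoder = nn.Sequential(
            nn.Linear(content_dim + gender_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, feat_dim),
        )

    @staticmethod
    def reparameterise(mu, logvar):
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def forward(self, x, gender_code=None):
        h = self.encoder(x)
        c_mu, c_logvar = self.content_mu(h), self.content_logvar(h)
        g_mu, g_logvar = self.gender_mu(h), self.gender_logvar(h)
        z_content = self.reparameterise(c_mu, c_logvar)
        # At inference, an external gender code can replace the inferred one
        # to hide the speaker's gender in the reconstructed signal.
        z_gender = self.reparameterise(g_mu, g_logvar) if gender_code is None else gender_code
        x_hat = self.decoder(torch.cat([z_content, z_gender], dim=-1))
        return x_hat, (c_mu, c_logvar), (g_mu, g_logvar)


def vae_loss(x, x_hat, content_stats, gender_stats, beta=1.0):
    """Standard beta-VAE objective: reconstruction plus KL on both subspaces."""
    recon = F.mse_loss(x_hat, x, reduction="mean")
    kl = 0.0
    for mu, logvar in (content_stats, gender_stats):
        kl = kl + (-0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp()))
    return recon + beta * kl
```

In this sketch, privacy protection comes from discarding or overwriting the gender latent before decoding, so that downstream classifiers attacking gender or identity from the reconstructed representation perform no better than chance while the linguistic content needed for speech recognition is preserved.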