Differential privacy (DP) is one avenue for safeguarding the user information used to train deep models: it imposes noisy distortion on private data. Such noise perturbation often causes severe performance degradation in automatic speech recognition (ASR) when a privacy budget $\varepsilon$ must be met. Private aggregation of teacher ensembles (PATE) exploits ensemble probabilities to improve ASR accuracy under the noise levels dictated by small values of $\varepsilon$. We extend PATE learning to dynamic patterns, namely speech utterances, and provide a first experimental demonstration that it prevents acoustic data leakage in ASR training. We evaluate three end-to-end deep models, namely LAS, hybrid CTC/attention, and the RNN transducer, on the open-source LibriSpeech and TIMIT corpora. PATE learning-enhanced ASR models outperform the benchmark DP-SGD mechanism, especially under strict DP budgets, yielding relative word error rate reductions between 26.2% and 27.5% for an RNN transducer model evaluated on LibriSpeech. We also introduce a DP-preserving ASR solution for pretraining on public speech corpora.
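For context, the classical PATE aggregator (Papernot et al.) labels each query by a noisy plurality vote over the teacher ensemble; the following is a minimal sketch assuming the standard noisy-max mechanism, since the abstract does not specify the exact sequence-level variant used for ASR. With $n_j(x)$ denoting the number of teachers that predict class $j$ for input $x$, and $\gamma$ a noise parameter tied to the privacy budget $\varepsilon$,
$$ f(x) = \operatorname*{arg\,max}_{j} \Big\{ n_j(x) + \mathrm{Lap}\big(\tfrac{1}{\gamma}\big) \Big\}, $$
where $\mathrm{Lap}(1/\gamma)$ is Laplace noise with scale $1/\gamma$; a larger $\gamma$ adds less noise but yields a weaker privacy guarantee.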