We propose using an adversarial autoencoder (AAE) to replace the generative adversarial network (GAN) in the private aggregation of teacher ensembles (PATE), a solution for ensuring differential privacy in speech applications. The AAE architecture allows us to obtain good synthetic speech by leveraging discriminative training of latent vectors. This synthetic speech is used to build a privacy-preserving classifier when sufficient non-sensitive data is not available in the public domain. The classifier follows the PATE scheme, which uses an ensemble of noisy outputs to label the synthetic samples and guarantee $\varepsilon$-differential privacy (DP) on its derived classifiers. Our proposed framework thus consists of an AAE-based generator and a PATE-based classifier (PATE-AAE). Evaluated on the Google Speech Commands Dataset Version II, the proposed PATE-AAE improves average classification accuracy by $2.11\%$ and $6.60\%$ over the alternative privacy-preserving solutions PATE-GAN and DP-GAN, respectively, while maintaining a strong privacy target of $\varepsilon = 0.01$ with a fixed $\delta = 10^{-5}$.
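The PATE labeling step mentioned above (an ensemble of noisy outputs labeling synthetic samples) can be illustrated with a minimal sketch of noisy-max vote aggregation. This is not the authors' implementation: the function names `noisy_vote_label` and `label_synthetic_batch`, the choice of Laplace noise, and the `noise_scale` parameter are assumptions made for illustration, and the mapping from the noise scale to a concrete $(\varepsilon, \delta)$ budget depends on privacy accounting that is not shown here.

```python
import numpy as np


def noisy_vote_label(teacher_preds, num_classes, noise_scale, rng):
    """Return the class whose Laplace-perturbed teacher vote count is largest.

    teacher_preds: 1-D integer array, one predicted class per teacher.
    noise_scale:   scale of the Laplace noise (assumed parameter; larger
                   values mean more noise and hence stronger privacy).
    """
    # Count how many teachers voted for each class.
    votes = np.bincount(teacher_preds, minlength=num_classes).astype(float)
    # Perturb every count independently before taking the arg-max.
    votes += rng.laplace(loc=0.0, scale=noise_scale, size=num_classes)
    return int(np.argmax(votes))


def label_synthetic_batch(teacher_preds_matrix, num_classes, noise_scale, seed=0):
    """Label a batch of synthetic samples with noisy-max aggregation.

    teacher_preds_matrix: integer array of shape (num_teachers, num_samples).
    Returns an integer array of shape (num_samples,).
    """
    rng = np.random.default_rng(seed)
    return np.array(
        [
            noisy_vote_label(column, num_classes, noise_scale, rng)
            for column in teacher_preds_matrix.T
        ]
    )


if __name__ == "__main__":
    # Three hypothetical teachers voting on three synthetic samples.
    preds = np.array([[0, 1, 2],
                      [0, 1, 1],
                      [0, 2, 1]])
    print(label_synthetic_batch(preds, num_classes=3, noise_scale=1.0))
```

In this sketch a student classifier would then be trained on the synthetic samples paired with the returned labels, so that it never accesses the sensitive data directly.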