Recently proposed automatic pathological speech classification techniques use unsupervised auto-encoders to obtain a high-level abstract representation of speech. Since these representations are learned based on reconstructing the input, there is no guarantee that they are robust to pathology-unrelated cues such as speaker identity information. Further, these representations are not necessarily discriminative for pathology detection. In this paper, we exploit supervised auto-encoders to extract robust and discriminative speech representations for Parkinson's disease classification. To reduce the influence of speaker variabilities unrelated to pathology, we propose to obtain speaker identity-invariant representations by adversarial training of an auto-encoder and a speaker identification task. To obtain a discriminative representation, we propose to jointly train an auto-encoder and a pathological speech classifier. Experimental results on a Spanish database show that the proposed supervised representation learning methods yield more robust and discriminative representations for automatically classifying Parkinson's disease speech, outperforming the baseline unsupervised representation learning system.
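As an illustration only, the two supervised objectives described above can be sketched in plain Python. The abstract does not give the loss functions, so everything here is an assumption: mean squared error for reconstruction, cross-entropy for both classifiers, and the hypothetical weights `alpha` and `lam`. The function names are likewise illustrative and not taken from the paper.

```python
import math


def mse(x, x_hat):
    """Mean squared reconstruction error between input and decoder output."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class under predicted probabilities."""
    return -math.log(probs[label])


def adversarial_objective(x, x_hat, spk_probs, spk_label, lam=1.0):
    """Auto-encoder objective for speaker identity-invariant representations.

    Assumed form: the auto-encoder minimizes reconstruction error while
    maximizing the speaker classifier's loss, hence the minus sign.
    The speaker classifier itself would separately minimize its own
    cross-entropy on the same representation.
    """
    return mse(x, x_hat) - lam * cross_entropy(spk_probs, spk_label)


def joint_objective(x, x_hat, pd_probs, pd_label, alpha=1.0):
    """Auto-encoder objective for discriminative representations.

    Assumed form: reconstruction error plus the weighted loss of the
    pathological speech classifier, both minimized together.
    """
    return mse(x, x_hat) + alpha * cross_entropy(pd_probs, pd_label)


# Toy values standing in for one input frame and its reconstruction.
x, x_hat = [1.0, 0.0], [0.5, 0.0]

# A speaker classifier that is confident about the true speaker (index 0)
# versus one that is at chance level.
confident = adversarial_objective(x, x_hat, [0.9, 0.1], 0, lam=0.5)
at_chance = adversarial_objective(x, x_hat, [0.5, 0.5], 0, lam=0.5)

joint = joint_objective(x, x_hat, [0.5, 0.5], 0, alpha=1.0)
```

Under these assumptions, the adversarial objective is lower when the speaker classifier is at chance level, which is the behavior a speaker identity-invariant representation should encourage. In a neural implementation this sign flip is commonly realized with alternating updates or a gradient reversal layer; which of these the authors use is not stated in the abstract.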