Using phonological speech vocoding, we propose a platform for exploring the relations between phonology and speech processing and, more broadly, between the abstract and physical structures of a speech signal. Our goal is to take a step towards bridging phonology and speech processing and to contribute to the program of Laboratory Phonology. We present three application examples for laboratory phonology: compositional phonological speech modelling, a comparison of phonological systems, and an experimental phonological parametric text-to-speech (TTS) system. We consider the featural representations of three phonological systems: (i) Government Phonology (GP), (ii) the Sound Pattern of English (SPE), and (iii) the extended SPE (eSPE). Comparing GP- and eSPE-based vocoded speech, we conclude that the latter achieves slightly better results than the former. However, GP, the most compact phonological speech representation, performs comparably to the systems with a higher number of phonological features. The parametric TTS system, based on a phonological speech representation and trained on an unlabelled audiobook in an unsupervised manner, achieves 85% of the intelligibility of state-of-the-art parametric speech synthesis. We envision that the presented approach paves the way for researchers in both fields to form meaningful hypotheses that are explicitly testable using the concepts developed and exemplified in this paper. On the one hand, laboratory phonologists might test the applied concepts of their theoretical models; on the other hand, the speech processing community may use the concepts developed for theoretical phonological models to improve current state-of-the-art applications.