Speech vocoding for laboratory phonology

Using phonological speech vocoding, we propose a platform for exploring relations between phonology and speech processing and, more broadly, between the abstract and physical structures of a speech signal. Our goal is to take a step towards bridging phonology and speech processing and to contribute to the program of Laboratory Phonology. We show three application examples for laboratory phonology: compositional phonological speech modelling, a comparison of phonological systems, and an experimental phonological parametric text-to-speech (TTS) system. The featural representations of three phonological systems are considered in this work: (i) Government Phonology (GP), (ii) the Sound Pattern of English (SPE), and (iii) the extended SPE (eSPE). Comparing GP- and eSPE-based vocoded speech, we conclude that the latter achieves slightly better results than the former. However, GP, the most compact phonological speech representation, performs comparably to the systems with a higher number of phonological features. The parametric TTS based on phonological speech representation, trained from an unlabelled audiobook in an unsupervised manner, achieves 85% of the intelligibility of state-of-the-art parametric speech synthesis. We envision that the presented approach paves the way for researchers in both fields to form meaningful hypotheses that are explicitly testable using the concepts developed and exemplified in this paper. On the one hand, laboratory phonologists might test the applied concepts of their theoretical models; on the other hand, the speech processing community may use the concepts developed for theoretical phonological models to improve current state-of-the-art applications.

