SoundChoice: Grapheme-to-Phoneme Models with Semantic Disambiguation

End-to-end speech synthesis models directly convert input characters into an audio representation (e.g., spectrograms). Despite their impressive performance, such models have difficulty disambiguating the pronunciations of identically spelled words (homographs). To mitigate this issue, a separate Grapheme-to-Phoneme (G2P) model can be employed to convert the characters into phonemes before synthesizing the audio. This paper proposes SoundChoice, a novel G2P architecture that processes entire sentences rather than operating at the word level. The proposed architecture uses a weighted homograph loss (which improves disambiguation), exploits curriculum learning (which gradually switches from word-level to sentence-level G2P), and integrates word embeddings from BERT (for further performance improvement). Moreover, the model inherits best practices from speech recognition, including multi-task learning with Connectionist Temporal Classification (CTC) and beam search with an embedded language model. As a result, SoundChoice achieves a Phoneme Error Rate (PER) of 2.65% on whole-sentence transcription using data from LibriSpeech and Wikipedia.

Index Terms: grapheme-to-phoneme, speech synthesis, text-to-speech, phonetics, pronunciation, disambiguation.
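The abstract reports its headline result as a Phoneme Error Rate. As a rough illustration of what that metric measures, the sketch below computes PER in the conventional way: total Levenshtein edit distance between predicted and reference phoneme sequences, divided by the total number of reference phonemes. This is an assumption about the metric's usual definition, not the authors' scoring code; the function names and the ARPAbet-style symbols in the example are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    # prev[j] holds the distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phoneme_error_rate(refs: List[Sequence[str]],
                       hyps: List[Sequence[str]]) -> float:
    """Corpus-level PER: total edits divided by total reference phonemes."""
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must have the same number of items")
    total_ref = sum(len(r) for r in refs)
    if total_ref == 0:
        raise ValueError("reference set contains no phonemes")
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return total_edits / total_ref


# Illustrative homograph case: "lead" (the metal) predicted as "lead" (to guide).
references = [["L", "EH", "D"], ["R", "IY", "D"]]
hypotheses = [["L", "IY", "D"], ["R", "IY", "D"]]
per = phoneme_error_rate(references, hypotheses)
print(f"PER = {per:.2%}")
```

A single wrong vowel in a homograph counts as one substitution, which is why a model can score a low overall PER while still mispronouncing homographs, and why a disambiguation-specific loss term is useful on top of the sequence-level objective.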

