High-quality articulatory speech synthesis has many potential applications in speech science and technology. However, developing appropriate mappings from linguistic specifications to articulatory gestures is difficult and time-consuming. In this paper, we construct an optimisation-based framework as a first step towards learning these mappings without manual intervention. We demonstrate the production of syllables with complex onsets and discuss the quality of the resulting articulatory gestures with reference to coarticulation.