We present an approach to imbuing a synthesized voice with expressivity. A thematic analysis of 10 interviews with vocal studies and performance experts informs a design framework for a real-time, interactive vocal persona that generates compelling, contextually appropriate expression. The resulting tone of voice is defined as a point within a continuous, contextually dependent probability space. Including a vocal persona in synthesized voice can be significant across a broad range of applications. Of particular interest is its potential impact on the augmentative and assistive communication (AAC) community. We conclude by introducing our ongoing research into the themes of vocal persona and how they may continue to inform proposed design frameworks for expressive speech synthesis.