Macedonian Speech Synthesis for Assistive Technology Applications

Bojan Sofronievski, Elena Velovska, Martin Velichkovski, Violeta Argirova, Tea Veljkovikj, Risto Chavdarov, Stefan Janev, Kristijan Lazarev, Toni Bachvarovski, Zoran Ivanovski, Dimitar Tashkovski, Branislav Gerazov

From arXiv; 5 pages, 1 figure; EUSIPCO 2022 conference

Speech technology is becoming ever more ubiquitous with the advance of speech-enabled devices and services. The use of speech synthesis in Augmentative and Alternative Communication (AAC) tools has facilitated the inclusion of individuals with speech impediments, allowing them to communicate with their surroundings using speech. Although there are numerous speech synthesis systems for the most widely spoken world languages, the offering for smaller languages is still limited. We propose and compare three models for Macedonian, built using parametric and deep learning techniques and trained on a newly recorded corpus. We target low-resource edge deployment for AAC and assistive technologies, such as communication boards and screen readers. The listening test results show that parametric speech synthesis performs as well as the more advanced deep learning models. Since it also requires fewer resources and offers full control over speech rate and pitch, it is the preferred choice for building a Macedonian TTS system for this application scenario.


Related content

Speech synthesis


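Speech synthesis, also known as text-to-speech (TTS), converts arbitrary input text into natural, fluent speech output. It draws on several disciplines, including artificial intelligence, psychology, acoustics, linguistics, digital signal processing, and computer science, and is a frontier technology in the field of information processing. As computing technology has improved, speech synthesis has progressed from early formant synthesis to waveform concatenation synthesis and statistical parametric speech synthesis, and then to hybrid speech synthesis. The quality and naturalness of synthesized speech have improved markedly and can largely meet the needs of certain specific application settings. Speech synthesis is now widely used in information announcement systems in banks, hospitals, and similar settings, in car navigation systems, and in automated call centers, and has produced substantial economic benefits. In addition, with the proliferation of media closely tied to daily life, such as smartphones, MP3 players, and PDAs, its applications are extending into entertainment, language teaching, and rehabilitation therapy. Speech synthesis can be said to be affecting every aspect of people's lives.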
