Scribosermo: Fast Speech-to-Text models for German and other Languages

Recent Speech-to-Text models often require a large amount of hardware resources and are mostly trained on English. This paper presents Speech-to-Text models for German, as well as for Spanish and French, with three special features: (a) They are small and run in real time on small single-board computers such as a Raspberry Pi. (b) Starting from a pretrained English model, they can be trained on consumer-grade hardware with a relatively small dataset. (c) The models are competitive with other solutions and outperform them in German. In this respect, the models combine the advantages of other approaches, each of which offers only a subset of the presented features. Furthermore, the paper provides a new library for handling datasets, which is designed to be easily extended with additional datasets. It also shows an optimized way to transfer-learn new languages using a pretrained model from another language with a similar alphabet.
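
The abstract does not say how the pretrained English model is adapted to a new alphabet. As a minimal sketch of one common way to do this, assuming a character-level output layer with one weight row per character, the rows for characters shared by both alphabets can be copied and the rows for new characters (such as ä, ö, ü, ß) freshly initialized. The function name `transfer_output_layer`, the toy weights, and the alphabets below are hypothetical illustrations, not the paper's code.

```python
import random


def transfer_output_layer(src_alphabet, src_weights, dst_alphabet, seed=0):
    """Build output-layer rows for dst_alphabet from a source-language layer.

    Hypothetical helper: rows of characters present in both alphabets are
    copied; rows of characters new to the target language get small random
    values. Returns the new rows and the list of reused characters.
    """
    rng = random.Random(seed)
    hidden = len(src_weights[0])
    src_index = {ch: i for i, ch in enumerate(src_alphabet)}

    dst_weights, reused = [], []
    for ch in dst_alphabet:
        if ch in src_index:
            # Shared character: keep what the pretrained model learned.
            dst_weights.append(list(src_weights[src_index[ch]]))
            reused.append(ch)
        else:
            # New character: no pretrained row exists, so start from noise.
            dst_weights.append([rng.uniform(-0.05, 0.05) for _ in range(hidden)])
    return dst_weights, reused


# Toy alphabets (assumed for illustration only).
english = list("abcdefghijklmnopqrstuvwxyz' ")
german = list("abcdefghijklmnopqrstuvwxyzäöüß ")

# Toy "pretrained" layer: 4 hidden units, row i filled with the value i.
src = [[float(i)] * 4 for i in range(len(english))]

new_w, reused = transfer_output_layer(english, src, german)
print(len(reused), "rows reused,", len(german) - len(reused), "rows newly initialized")
```

The more characters the two alphabets share, the larger the part of the output layer that starts from pretrained values, which is one plausible reason a source language with a similar alphabet helps.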
