Common Phone: A Multilingual Dataset for Robust Acoustic Modelling

Current state-of-the-art acoustic models can easily comprise more than 100 million parameters. This growing complexity demands larger training datasets so that the final decision function still generalizes adequately. An ideal dataset is not necessarily large in size, but large with respect to the number of unique speakers, the recording hardware used, and the range of recording conditions. This enables a machine learning model to explore as much of the domain-specific input space as possible during parameter estimation. This work introduces Common Phone, a gender-balanced, multilingual corpus recorded from more than 76,000 contributors via Mozilla's Common Voice project. It comprises around 116 hours of speech enriched with automatically generated phonetic segmentation. A Wav2Vec 2.0 acoustic model was trained on Common Phone to perform phonetic symbol recognition and to validate the quality of the generated phonetic annotation. The architecture achieved a PER of 18.1% on the entire test set, computed over all 101 unique phonetic symbols, with slight differences between the individual languages. We conclude that Common Phone provides sufficient variability and reliable phonetic annotation to help bridge the gap between research and application of acoustic models.
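The abstract reports a PER but does not say how it was computed. As a rough illustration only, the sketch below assumes PER is the usual phone (or phoneme) error rate: the Levenshtein distance between reference and predicted symbol sequences, summed over all utterances and divided by the total number of reference symbols. The function names and the toy sequences are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two symbol sequences (unit costs)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance between ref[:i] and an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1       # symbol of ref missing in hyp
            insertion = curr[j - 1] + 1  # extra symbol in hyp
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def phone_error_rate(
    refs: Sequence[Sequence[str]], hyps: Sequence[Sequence[str]]
) -> float:
    """Corpus-level error rate: total edits / total reference symbols.

    Assumption: this is the standard definition, not necessarily the
    exact scoring procedure used for the reported figure.
    """
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must contain the same number of utterances")
    total_ref = sum(len(r) for r in refs)
    if total_ref == 0:
        raise ValueError("reference corpus is empty")
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return total_edits / total_ref


# Toy example (made-up symbols): one substitution and one deletion
# against a four-symbol reference.
reference = [["h", "ɛ", "l", "oʊ"]]
hypothesis = [["h", "ə", "l"]]
print(phone_error_rate(reference, hypothesis))
```

Under this definition the score is accumulated over the whole corpus rather than averaged per utterance, so long utterances weigh more than short ones. Per-language scores would follow by calling the function on each language's subset separately.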

