We present Vakyansh, an end-to-end toolkit for speech recognition in Indic languages. India is home to almost 121 languages and around 125 crore (1.25 billion) speakers, yet most of these languages are low-resource in terms of both data and pretrained models. Through Vakyansh, we introduce automated pipelines for data creation, model training, model evaluation, and deployment. We create 14,000 hours of speech data in 23 Indic languages and train wav2vec 2.0 based pretrained models. These pretrained models are then finetuned to create state-of-the-art speech recognition models for 18 Indic languages, which are followed by language models and punctuation restoration models. We open-source all these resources with the aim of inspiring the speech community to develop speech-first applications in Indic languages using our ASR models.