This paper describes the results of an informal collaboration launched during the African Master of Machine Intelligence (AMMI) in June 2020. After a series of lectures and labs on speech data collection using mobile applications and on self-supervised representation learning from speech, a small group of students and the lecturer continued working on an automatic speech recognition (ASR) project for three languages: Wolof, Ga, and Somali. The paper describes how the data was collected and how ASR systems were developed with a small amount (1 h) of transcribed speech as training data. In these low-resource conditions, pre-training a model on large amounts of raw speech was fundamental for the efficiency of the ASR systems developed.