This paper introduces the largest dataset for Arabic speech mispronunciation detection in Egyptian dialogues. The dataset consists of annotated audio recordings of the 100 most frequently used words in the Arabic language, pronounced by 100 Egyptian children aged 2 to 8 years. Expert listeners collected the recordings and annotated them for segmental pronunciation errors.