Weakly-supervised word-level pronunciation error detection in non-native English speech

We propose a weakly-supervised model for word-level mispronunciation detection in non-native (L2) English speech. Training does not require phonetically transcribed L2 speech; only the mispronounced words need to be marked. Without phonetic transcriptions of L2 speech, the model must learn from the weak signal of word-level mispronunciation labels alone. This, combined with the limited amount of mispronounced L2 speech, makes the model prone to overfitting. To limit this risk, we train it in a multi-task setup. In the first task, we estimate the probability of word-level mispronunciation. For the second task, we use a phoneme recognizer trained on phonetically transcribed L1 speech, which is easily accessible and can be annotated automatically. Compared to state-of-the-art approaches, we improve the accuracy of detecting word-level pronunciation errors, measured by AUC, by 30% on the GUT Isle Corpus of L2 Polish speakers and by 21.5% on the Isle Corpus of L2 German and Italian speakers.
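The multi-task setup described above can be illustrated with a minimal sketch of a combined objective. The abstract does not give the actual loss, so everything here is an assumption for illustration: the function names, the use of binary cross-entropy for the word-level task, per-phoneme cross-entropy for the recognizer task, and the `phoneme_weight` value are all hypothetical.

```python
import math


def binary_cross_entropy(probs, labels, eps=1e-12):
    """Mean binary cross-entropy between predicted word-level
    mispronunciation probabilities and 0/1 labels (1 = mispronounced)."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


def phoneme_cross_entropy(phoneme_probs, phoneme_targets, eps=1e-12):
    """Mean negative log-likelihood of the reference phoneme at each
    position. Each entry of phoneme_probs maps phoneme -> probability."""
    total = 0.0
    for dist, target in zip(phoneme_probs, phoneme_targets):
        total += -math.log(max(dist.get(target, 0.0), eps))
    return total / len(phoneme_targets)


def multitask_loss(word_probs, word_labels,
                   phoneme_probs, phoneme_targets,
                   phoneme_weight=0.5):
    """Weighted sum of the two task losses.

    Assumption: phoneme_weight is a hypothetical hyperparameter; the
    source does not state how the two tasks are balanced.
    """
    word_loss = binary_cross_entropy(word_probs, word_labels)
    phone_loss = phoneme_cross_entropy(phoneme_probs, phoneme_targets)
    return word_loss + phoneme_weight * phone_loss


# Toy example: two L2 words with mispronunciation labels, and one
# L1 phoneme position with a reference transcription.
loss = multitask_loss(
    word_probs=[0.5, 0.5],
    word_labels=[1, 0],
    phoneme_probs=[{"a": 0.5, "b": 0.5}],
    phoneme_targets=["a"],
)
```

In this sketch the two terms may come from different data, in line with the abstract: the word-level term from L2 speech with word-level labels, and the phoneme term from phonetically transcribed L1 speech. The auxiliary term acts as a regularizer on the shared model, which is one plausible reading of how the multi-task setup limits overfitting.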

