In this report we propose pipelines for Goodness of Pronunciation (GoP) computation that solve the out-of-vocabulary (OOV) problem at test time using vocabulary/lexicon expansion techniques. The pipelines use different components of an ASR system to quantify accent and automatically evaluate it as a score. We use the posteriors of an ASR model trained on native English speech, together with phone-level boundaries, to obtain phone-level pronunciation scores. Taking this as the baseline pipeline, we implement methods to remove UNK and SPN phonemes from the GoP output by building three pipelines: Online, Offline, and Hybrid. Each returns the scores and can also prevent unknown words from appearing in the final output. The Online method works per utterance, the Offline method pre-incorporates a set of OOV words for a given data set, and the Hybrid method combines the two ideas, expanding the lexicon in advance while also working per utterance. We further provide utilities, such as phoneme-to-posterior mappings, the GoP scores of each utterance as a vector, and the word boundaries used in the GoP pipeline, for use in future research.
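As a rough illustration of how frame-level posteriors and phone-level boundaries can be combined into phone-level scores, the sketch below uses one common GoP formulation, the mean log posterior of the canonical phone over its aligned frames. This is an assumption for illustration only, not necessarily the formulation used in the report; the function name `gop_scores`, the dictionary-based posterior format, and the `floor` parameter are all hypothetical.

```python
import math


def gop_scores(posteriors, segments, floor=1e-10):
    """Return a list of (phone, score) pairs, one per aligned phone segment.

    posteriors: list of frames; each frame is a dict mapping a phone label
                to its posterior probability (hypothetical format).
    segments:   list of (phone, start_frame, end_frame) tuples taken from a
                forced alignment; end_frame is exclusive.
    floor:      lower bound applied before the log to avoid log(0).
    """
    scores = []
    for phone, start, end in segments:
        frames = posteriors[start:end]
        if not frames:
            raise ValueError(f"empty segment for phone {phone!r}")
        # Mean log posterior of the canonical phone over its frames.
        total = sum(math.log(max(f.get(phone, 0.0), floor)) for f in frames)
        scores.append((phone, total / len(frames)))
    return scores


# Toy example: three frames, two phones.
posteriors = [
    {"AH": 0.9, "IY": 0.1},
    {"AH": 0.8, "IY": 0.2},
    {"AH": 0.3, "IY": 0.7},
]
segments = [("AH", 0, 2), ("IY", 2, 3)]
utterance_gop = gop_scores(posteriors, segments)
```

Scores closer to zero mean the model assigned high posterior mass to the expected phone. Collecting the scores in segment order gives a per-utterance vector of the kind the report describes.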