Automatic Speech Scoring (ASS) is the computer-assisted evaluation of a candidate's speaking proficiency in a language. ASS systems face many challenges, such as open grammar, variable pronunciations, and unstructured or semi-structured content. Recent deep learning approaches have shown some promise in this domain. However, most of these approaches focus on extracting features from a single audio response, leaving them without the speaker-specific context required to model such a complex task. We propose a novel deep learning technique for non-native ASS, called speaker-conditioned hierarchical modeling. Our technique takes advantage of the fact that oral proficiency tests rate multiple responses per candidate. We extract context vectors from these responses and feed them as additional speaker-specific context to our network when scoring a particular response. We compare our technique with strong baselines and find that such modeling improves the model's average performance by 6.92% (maximum = 12.86%, minimum = 4.51%). We further provide both quantitative and qualitative insights into the importance of this additional context in solving the problem of ASS.
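The core idea, conditioning the score of one response on context vectors pooled from the candidate's other responses, can be illustrated with a minimal sketch. This is not the paper's architecture: the mean-pooling of context vectors, the concatenation-based conditioning, and the linear scoring head are all simplifying assumptions made here for illustration.

```python
import numpy as np

def score_with_speaker_context(response_embs, target_idx, w, b):
    """Score one response, conditioned on the speaker's other responses.

    response_embs : (n_responses, d) array of per-response embeddings
    target_idx    : index of the response being scored
    w, b          : parameters of a hypothetical linear scoring head
    """
    target = response_embs[target_idx]
    # Speaker-specific context: pool the *other* responses of the same
    # candidate (mean pooling is an assumption, chosen for simplicity).
    others = np.delete(response_embs, target_idx, axis=0)
    context = others.mean(axis=0)
    # Condition the scorer on the speaker context via concatenation.
    features = np.concatenate([target, context])
    return float(features @ w + b)

# Example: 4 responses from one candidate, 8-dimensional embeddings.
rng = np.random.default_rng(0)
embs = rng.normal(size=(4, 8))
w = rng.normal(size=16)  # head sees target + context (2 * 8 dims)
score = score_with_speaker_context(embs, 0, w, 0.0)
```

In a full system the embeddings would come from a learned audio encoder and the scoring head would be trained end to end; the sketch only shows how per-speaker context enters the computation alongside the target response.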