Towards Data-efficient Modeling for Wake Word Spotting

Wake word (WW) spotting is challenging in far-field conditions, not only because of interference in signal transmission but also because of the complexity of acoustic environments. Traditional WW model training requires a large amount of in-domain, WW-specific data with substantial human annotation, so it is hard to build WW models without such data. In this paper, we present data-efficient solutions to address the challenges in WW modeling, such as domain mismatch, noisy conditions, and limited annotation. Our proposed system is composed of a multi-condition training pipeline with stratified data augmentation, which improves the model's robustness to a variety of predefined acoustic conditions, together with a semi-supervised learning pipeline that accurately extracts WW and confusable examples from an untranscribed speech corpus. Starting from only 10 hours of domain-mismatched WW audio, we are able to enlarge and enrich the training dataset by 20-100 times to capture the acoustic complexity. Our experiments on real user data show that the proposed solutions achieve performance comparable to that of a production-grade model while saving 97% of the WW-specific data collection and 86% of the annotation bandwidth.
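The abstract does not say how the multi-condition augmentation is implemented. As a rough illustration only, the sketch below mixes a clean utterance with noise at a target signal-to-noise ratio (SNR) and draws the same number of SNR values from each of several bands. The function names (`mix_at_snr`, `stratified_snrs`), the SNR bands, and the reading of "stratified" as equal sampling per band are all assumptions for this example, not details taken from the paper.

```python
import math
import random


def power(x):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in x) / len(x)


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the mixture has the requested SNR in dB."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    p_speech, p_noise = power(speech), power(noise)
    if p_noise == 0.0:
        return list(speech)  # silent noise: nothing to mix in
    # Solve 10*log10(p_speech / (gain**2 * p_noise)) = snr_db for gain.
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def stratified_snrs(strata, per_stratum, rng):
    """Draw the same number of SNR values from each (low, high) dB band."""
    return [rng.uniform(lo, hi) for lo, hi in strata for _ in range(per_stratum)]


rng = random.Random(0)
# Synthetic stand-ins for real audio: a 440 Hz tone and Gaussian noise at 16 kHz.
speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

strata = [(0, 5), (5, 15), (15, 25)]  # hypothetical SNR bands in dB
augmented = [mix_at_snr(speech, noise, snr)
             for snr in stratified_snrs(strata, per_stratum=2, rng=rng)]
```

Each clean utterance here yields six noisy variants, two per band. Producing many conditioned copies of each clip is one plausible way a small seed set could be enlarged by the 20-100 times the abstract reports, although the paper's actual conditions and ratios are not given in this listing.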

