Improving Label-Deficient Keyword Spotting Using Self-Supervised Pretraining

In recent years, the development of accurate deep keyword spotting (KWS) models has resulted in KWS technology being embedded in a number of technologies such as voice assistants. Many of these models rely on large amounts of labelled data to achieve good performance. As a result, their use is restricted to applications for which a large labelled speech data set can be obtained. Self-supervised learning seeks to mitigate the need for large labelled data sets by leveraging unlabelled data, which is easier to obtain in large amounts. However, most self-supervised methods have only been investigated for very large models, whereas KWS models are desired to be small. In this paper, we investigate the use of self-supervised pretraining for the smaller KWS models in a label-deficient scenario. We pretrain the Keyword Transformer model using the self-supervised framework Data2Vec and carry out experiments on a label-deficient setup of the Google Speech Commands data set. It is found that the pretrained models greatly outperform the models without pretraining, showing that Data2Vec pretraining can increase the performance of KWS models in label-deficient scenarios. The source code is made publicly available.
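The label-deficient setup mentioned in the abstract can be pictured as keeping labels for only a small part of the training data and treating the rest as an unlabelled pool for pretraining. The sketch below is only an illustration of that idea, not the authors' code: the function name `label_deficient_split`, the `labelled_fraction` parameter, and its default value are assumptions, since the abstract does not state how the split was made.

```python
import random


def label_deficient_split(utterances, labelled_fraction=0.2, seed=0):
    """Split (clip_id, keyword) pairs into a small labelled set and an unlabelled pool.

    Hypothetical helper: the fraction and the random split are assumptions,
    not values taken from the paper.
    """
    if not 0.0 < labelled_fraction < 1.0:
        raise ValueError("labelled_fraction must be strictly between 0 and 1")

    rng = random.Random(seed)      # local RNG so the split is reproducible
    shuffled = list(utterances)    # copy, so the caller's list is not reordered
    rng.shuffle(shuffled)

    n_labelled = max(1, int(len(shuffled) * labelled_fraction))
    labelled = shuffled[:n_labelled]
    # Labels are dropped for the remaining clips: only the clip ids are kept,
    # as they would be used for self-supervised pretraining.
    unlabelled = [clip_id for clip_id, _ in shuffled[n_labelled:]]
    return labelled, unlabelled
```

In such a setup, the unlabelled pool would feed the self-supervised pretraining stage and the labelled subset would be used for supervised fine-tuning of the keyword spotting model.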
