Collecting sufficient labeled data for spoken language understanding (SLU) is expensive and time-consuming. Recent studies have achieved promising results by using pre-trained models in low-resource scenarios. Inspired by this, we ask: which (if any) pre-training strategies can improve performance across SLU benchmarks? To answer this question, we employ four types of pre-trained models and their combinations for SLU. We leverage self-supervised speech and language models (LM) pre-trained on large quantities of unpaired data to extract strong speech and text representations. We also explore using supervised models pre-trained on larger external automatic speech recognition (ASR) or SLU corpora. We conduct extensive experiments on the Spoken Language Understanding Evaluation (SLUE) benchmark and observe that self-supervised pre-trained models are more powerful, with the pre-trained LM and speech models being most beneficial for the Sentiment Analysis and Named Entity Recognition tasks, respectively.
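As a minimal illustration of the representation-extraction step described above, the sketch below shows how speech and text features might be obtained from self-supervised pre-trained models before an SLU head is trained on top of them. It is not the authors' code: the HuggingFace checkpoints `facebook/wav2vec2-base` and `bert-base-uncased` are assumed stand-ins for the self-supervised speech model and LM, and the waveform is a dummy input rather than a SLUE utterance.

```python
# Hedged sketch: extract speech and text representations from
# self-supervised pre-trained models (assumed checkpoints, not
# necessarily those used in the paper).
import numpy as np
import torch
from transformers import (
    Wav2Vec2FeatureExtractor,
    Wav2Vec2Model,
    AutoTokenizer,
    AutoModel,
)

# Self-supervised speech model for the audio branch.
speech_extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base")
speech_model = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base")

# Self-supervised LM for the text branch (e.g. applied to ASR transcripts).
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
text_model = AutoModel.from_pretrained("bert-base-uncased")

# Dummy 1-second, 16 kHz waveform standing in for an SLU utterance.
waveform = np.zeros(16000, dtype=np.float32)
audio_inputs = speech_extractor(waveform, sampling_rate=16000, return_tensors="pt")
text_inputs = tokenizer("hypothetical transcript of the utterance", return_tensors="pt")

with torch.no_grad():
    # Frame-level speech representations: shape (1, T', hidden_size).
    speech_repr = speech_model(**audio_inputs).last_hidden_state
    # Token-level text representations: shape (1, L, hidden_size).
    text_repr = text_model(**text_inputs).last_hidden_state

# A downstream SLU head (e.g. for NER or sentiment) would be trained on
# top of these representations, either frozen or fine-tuned.
print(speech_repr.shape, text_repr.shape)
```

In this setup, the pre-trained encoders supply the "strong speech and text representations" mentioned in the abstract; whether they are frozen or fine-tuned, and how the speech and text branches are combined, are design choices the study compares rather than details fixed by this sketch.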