Historically, lower-level tasks such as automatic speech recognition (ASR) and speaker identification have been the main focus of the speech field. Recently, interest has grown in higher-level spoken language understanding (SLU) tasks such as sentiment analysis (SA). However, improving performance on SLU tasks remains a major challenge. There are two main approaches to SLU tasks: (1) the two-stage method, which uses a speech model to transcribe speech into text and then uses a language model to produce the downstream-task results; and (2) the one-stage method, which simply fine-tunes a pre-trained speech model on the downstream tasks. The first approach loses emotional cues such as intonation and introduces recognition errors during the ASR process, while the second lacks the necessary language knowledge. In this paper, we propose Wave BERT (WaBERT), a novel end-to-end model that combines a speech model and a language model for SLU tasks. WaBERT is built on pre-trained speech and language models, so training from scratch is not needed. We also freeze most of WaBERT's parameters during training. By integrating audio-specific information and language knowledge in a short-time, low-resource training process, WaBERT improves results on the dev set of the SLUE SA task by 1.15% in recall and 0.82% in F1 score. Additionally, we modify the serial Continuous Integrate-and-Fire (CIF) mechanism to achieve monotonic alignment between the speech and text modalities.
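The CIF mechanism mentioned above monotonically segments a frame-level feature sequence into label-level embeddings: it accumulates per-frame firing weights and "fires" an integrated embedding each time the running total crosses a threshold, splitting the boundary frame's weight between the finished label and the next one. A minimal plain-Python sketch of this standard integration step (the feature vectors and weights below are illustrative toy values, not outputs of the model described in the paper):

```python
def cif(features, weights, threshold=1.0):
    """Continuous Integrate-and-Fire over a frame sequence.

    features: list of equal-length frame vectors (lists of floats)
    weights:  per-frame firing weights (floats), same length as features
    Returns the list of integrated label-level embeddings.
    """
    outputs = []                       # fired label-level embeddings
    acc_w = 0.0                        # weight accumulated since last firing
    acc_v = [0.0] * len(features[0])   # weighted-feature accumulator
    for h, a in zip(features, weights):
        if acc_w + a < threshold:
            # Integrate: not enough accumulated weight to fire yet.
            acc_w += a
            acc_v = [v + a * x for v, x in zip(acc_v, h)]
        else:
            # Fire: split this frame's weight at the threshold boundary.
            used = threshold - acc_w   # portion completing the current label
            outputs.append([v + used * x for v, x in zip(acc_v, h)])
            rest = a - used            # remainder starts the next label
            acc_w = rest
            acc_v = [rest * x for x in h]
    return outputs

# Toy example: 4 one-dimensional frames with uniform weight 0.6
# fire twice (total weight 2.4), yielding two integrated embeddings.
labels = cif([[1.0], [2.0], [3.0], [4.0]], [0.6, 0.6, 0.6, 0.6])
```

Because the accumulator only moves forward through the frames, the resulting speech-to-label alignment is strictly monotonic, which is the property WaBERT relies on to align the speech and text modalities.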