Wav2vec 2. 0和BERT的多层次融合,促进多模式情感识别 (Multi-level Fusion of Wav2vec 2.0 and BERT for Multimodal Emotion Recognition) - 专知论文

会员服务 ·

0

多峰值 · MoDELS · BERT · Learning · state-of-the-art ·

2022 年 7 月 11 日

Multi-level Fusion of Wav2vec 2.0 and BERT for Multimodal Emotion Recognition

翻译：Wav2vec 2. 0和BERT的多层次融合,促进多模式情感识别

Zihan Zhao,Yanfeng Wang,Yu Wang

from arxiv, Accepted to INTERSPEECH 2022

The research and applications of multimodal emotion recognition have become increasingly popular recently. However, multimodal emotion recognition faces the challenge of lack of data. To solve this problem, we propose to use transfer learning which leverages state-of-the-art pre-trained models including wav2vec 2.0 and BERT for this task. Multi-level fusion approaches including coattention-based early fusion and late fusion with the models trained on both embeddings are explored. Also, a multi-granularity framework which extracts not only frame-level speech embeddings but also segment-level embeddings including phone, syllable and word-level speech embeddings is proposed to further boost the performance. By combining our coattention-based early fusion model and late fusion model with the multi-granularity feature extraction framework, we obtain result that outperforms best baseline approaches by 1.3% unweighted accuracy (UA) on the IEMOCAP dataset.

翻译：最近,多式联运情感识别的研究和应用最近越来越受欢迎。然而,多式联运情感识别面临缺乏数据的挑战。为了解决这一问题,我们提议利用转让学习,利用先进的预先培训模型,包括 wav2vec 2.0 和 BERT 来完成这项任务。多级聚合方法,包括以涂层为基础的早期聚变和与在两种嵌入方面受过训练的模型的延迟融合。此外,还探索了多级混合方法,不仅提取框架级语言嵌入,而且还提取包括电话、可调和字级语音嵌入在内的分层层嵌入,以进一步提高性能。通过将我们基于涂层的早期聚变模型和迟聚变模型与多级特征提取框架相结合,我们取得了以下结果:通过1.3%的未加权精确度(UA)在IEMOCAP数据集上将最佳基线方法排出1.3%。

0

相关内容

多峰值

【CVPR 2022】基于粗粒度和细粒度特征匹配的视频描述评估，EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching

【CVPR 2022】基于粗粒度和细粒度特征匹配的视频描述评估，EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching

专知会员服务

10+阅读 · 2022年3月19日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

会议交流 | IJCKG: International Joint Conference on Knowledge Graphs

会议交流 | IJCKG: International Joint Conference on Knowledge Graphs

开放知识图谱

0+阅读 · 2021年9月9日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

专知

23+阅读 · 2018年1月18日

高血压患者Corin基因变异对其蛋白结构及酶功能影响的研究

国家自然科学基金

0+阅读 · 2015年12月31日

Ti-Ag 纳米管/HA 复合涂层的抗菌性能及生物相容性的基础研究

国家自然科学基金

0+阅读 · 2015年12月31日

S3AGA样本（Spitzer-SDSS Spectral Atlas of Galaxies and AGNs)及其AGN研究

国家自然科学基金

0+阅读 · 2014年12月31日

糖肾方对糖尿病肾病蛋白聚糖介导肾脏脂质沉积的机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

碳化硅纳米点阵诱导高纯度多晶致密碳化硅陶瓷的制备与性能优化

国家自然科学基金

0+阅读 · 2012年12月31日

液相还原法制备Heusler合金纳米颗粒及其结构和性能研究

国家自然科学基金

0+阅读 · 2012年12月31日

铁基超导体AFe2-ySe2(A = K,Tl,Cs,Rb)的核磁共振和缪子自旋共振研究

国家自然科学基金

0+阅读 · 2012年12月31日

Cu基类金刚石结构新型热电化合物的设计与优化

国家自然科学基金

0+阅读 · 2012年12月31日

丙酮丁醇梭菌HtrA蛋白介导的丁醇耐受性机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

实时安全关键系统的建模、仿真与验证

国家自然科学基金

1+阅读 · 2012年12月31日

Contrastive Learning for Cross-Domain Open World Recognition

Arxiv

0+阅读 · 2022年9月2日

Negation detection in Dutch clinical texts: an evaluation of rule-based and machine learning methods

Negation detection in Dutch clinical texts: an evaluation of rule-based and machine learning methods

Arxiv

1+阅读 · 2022年9月1日

Contextualized language models for semantic change detection: lessons learned

Arxiv

0+阅读 · 2022年8月31日

Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing

Arxiv

30+阅读 · 2021年7月28日

Attention Bottlenecks for Multimodal Fusion

Arxiv

31+阅读 · 2021年6月30日

Knowledge Distillation and Student-Teacher Learning for Visual Intelligence: A Review and New Outlooks

Arxiv

18+阅读 · 2021年6月17日

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Arxiv

15+阅读 · 2020年3月31日

A Survey on Deep Learning for Named Entity Recognition

A Survey on Deep Learning for Named Entity Recognition

Arxiv

73+阅读 · 2018年12月22日

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Arxiv

15+阅读 · 2018年10月11日

CommanderSong: A Systematic Approach for Practical Adversarial Voice Recognition

Arxiv

14+阅读 · 2018年1月24日

VIP会员

文章信息

相关主题

state-of-the-art

相关VIP内容

【CVPR 2022】基于粗粒度和细粒度特征匹配的视频描述评估，EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching

【CVPR 2022】基于粗粒度和细粒度特征匹配的视频描述评估，EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching

专知会员服务

10+阅读 · 2022年3月19日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

卫星导航技术发展综述

《美军"僚机"联合能力技术演示项目：有人-无人火炮作战》41页报告

美军条令《火力指挥》116页

可解释的人工智能在生物医学图像分析中的应用综述

相关资讯

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

会议交流 | IJCKG: International Joint Conference on Knowledge Graphs

会议交流 | IJCKG: International Joint Conference on Knowledge Graphs

开放知识图谱

0+阅读 · 2021年9月9日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

专知

23+阅读 · 2018年1月18日

相关论文

Contrastive Learning for Cross-Domain Open World Recognition

Arxiv

0+阅读 · 2022年9月2日

Negation detection in Dutch clinical texts: an evaluation of rule-based and machine learning methods

Negation detection in Dutch clinical texts: an evaluation of rule-based and machine learning methods

Arxiv

1+阅读 · 2022年9月1日

Contextualized language models for semantic change detection: lessons learned

Arxiv

0+阅读 · 2022年8月31日

Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing

Arxiv

30+阅读 · 2021年7月28日

Attention Bottlenecks for Multimodal Fusion

Arxiv

31+阅读 · 2021年6月30日

Knowledge Distillation and Student-Teacher Learning for Visual Intelligence: A Review and New Outlooks

Arxiv

18+阅读 · 2021年6月17日

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Arxiv

15+阅读 · 2020年3月31日

A Survey on Deep Learning for Named Entity Recognition

A Survey on Deep Learning for Named Entity Recognition

Arxiv

73+阅读 · 2018年12月22日

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Arxiv

15+阅读 · 2018年10月11日

CommanderSong: A Systematic Approach for Practical Adversarial Voice Recognition

Arxiv

14+阅读 · 2018年1月24日

相关基金

高血压患者Corin基因变异对其蛋白结构及酶功能影响的研究

国家自然科学基金

0+阅读 · 2015年12月31日

Ti-Ag 纳米管/HA 复合涂层的抗菌性能及生物相容性的基础研究

国家自然科学基金

0+阅读 · 2015年12月31日

S3AGA样本（Spitzer-SDSS Spectral Atlas of Galaxies and AGNs)及其AGN研究

国家自然科学基金

0+阅读 · 2014年12月31日

糖肾方对糖尿病肾病蛋白聚糖介导肾脏脂质沉积的机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

碳化硅纳米点阵诱导高纯度多晶致密碳化硅陶瓷的制备与性能优化

国家自然科学基金

0+阅读 · 2012年12月31日

液相还原法制备Heusler合金纳米颗粒及其结构和性能研究

国家自然科学基金

0+阅读 · 2012年12月31日

铁基超导体AFe2-ySe2(A = K,Tl,Cs,Rb)的核磁共振和缪子自旋共振研究

国家自然科学基金

0+阅读 · 2012年12月31日

Cu基类金刚石结构新型热电化合物的设计与优化

国家自然科学基金

0+阅读 · 2012年12月31日

丙酮丁醇梭菌HtrA蛋白介导的丁醇耐受性机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

实时安全关键系统的建模、仿真与验证

国家自然科学基金

1+阅读 · 2012年12月31日

微信扫码咨询专知VIP会员