The lack of data and the difficulty of multimodal fusion have always been challenges for multimodal emotion recognition (MER). In this paper, we propose to use pretrained models as upstream networks, wav2vec 2.0 for the audio modality and BERT for the text modality, and fine-tune them on the downstream MER task to cope with the lack of data. To address the difficulty of multimodal fusion, we use a K-layer multi-head attention mechanism as the downstream fusion module. Starting from the MER task itself, we design two auxiliary tasks to alleviate the insufficient fusion between modalities and to guide the network to capture and align emotion-related features. Compared to previous state-of-the-art models, we achieve better performance, with 78.42% weighted accuracy (WA) and 79.71% unweighted accuracy (UA) on the IEMOCAP dataset.
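The abstract names the main architectural components (wav2vec 2.0 and BERT as upstream encoders, a K-layer multi-head attention fusion module). The following is a minimal, hypothetical PyTorch sketch of how such a pipeline might be wired; the checkpoint names, the value of K, the head count, the cross-attention direction (text queries attending to audio), and the pooling/classifier head are all assumptions for illustration, not the paper's exact design.

```python
import torch
import torch.nn as nn
from transformers import Wav2Vec2Model, BertModel

class FusionMER(nn.Module):
    """Hypothetical sketch: pretrained wav2vec 2.0 + BERT upstream encoders
    followed by K stacked multi-head cross-attention fusion layers."""
    def __init__(self, num_classes=4, k_layers=4, n_heads=8, d_model=768):
        super().__init__()
        # Upstream pretrained encoders (checkpoint names assumed)
        self.audio_encoder = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base")
        self.text_encoder = BertModel.from_pretrained("bert-base-uncased")
        # K-layer multi-head attention fusion: text features attend to audio features
        self.fusion = nn.ModuleList(
            [nn.MultiheadAttention(d_model, n_heads, batch_first=True) for _ in range(k_layers)]
        )
        self.classifier = nn.Linear(d_model, num_classes)

    def forward(self, audio_input_values, text_input_ids, text_attention_mask):
        a = self.audio_encoder(audio_input_values).last_hidden_state           # (B, Ta, 768)
        t = self.text_encoder(text_input_ids,
                              attention_mask=text_attention_mask).last_hidden_state  # (B, Tt, 768)
        fused = t
        for attn in self.fusion:
            fused, _ = attn(query=fused, key=a, value=a)                        # cross-modal attention
        return self.classifier(fused.mean(dim=1))                               # pooled emotion logits
```

In this sketch both encoders would be fine-tuned jointly with the fusion module on the MER objective; the two auxiliary tasks mentioned in the abstract would add extra loss terms on top of the classification loss, and are omitted here.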