Speech emotion recognition is a challenging task and an important step towards more natural human-computer interaction (HCI). A popular approach is multimodal emotion recognition based on model-level fusion, in which each modality is encoded into an embedding and the embeddings are concatenated for the final classification. However, due to noise and other factors, the individual modalities do not always point to the same emotional category, which degrades the generalization of the model. In this paper, we propose a novel regularization method based on contrastive learning for multimodal emotion recognition using audio and text. By introducing a discriminator that distinguishes pairs with the same emotion from pairs with different emotions, we explicitly constrain the latent code of each modality to carry the same emotional information, thereby reducing noise interference and yielding more discriminative representations. Experiments are performed on the standard IEMOCAP dataset for 4-class emotion recognition. The results show significant improvements of 1.44\% and 1.53\% in terms of weighted accuracy (WA) and unweighted accuracy (UA), respectively, compared to the baseline system.
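To make the fusion and regularization concrete, the following is a minimal sketch, not the paper's exact architecture: audio and text encoders produce latent codes that are concatenated for model-level fusion, while a small discriminator is trained to tell same-emotion pairs from different-emotion pairs and its loss is added as a contrastive regularizer. All module names, dimensions, the negative-sampling scheme, and the weight `lambda_reg` are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionClassifier(nn.Module):
    """Model-level fusion: per-modality encoders, concatenated embeddings, shared classifier."""
    def __init__(self, audio_dim=128, text_dim=128, num_classes=4):
        super().__init__()
        # Stand-ins for the per-modality encoders (e.g., RNN/Transformer stacks in practice).
        self.audio_enc = nn.Linear(audio_dim, 128)
        self.text_enc = nn.Linear(text_dim, 128)
        # Classifier over the concatenated (fused) embeddings.
        self.classifier = nn.Linear(256, num_classes)
        # Discriminator: judges whether an (audio, text) latent pair carries the same emotion.
        self.discriminator = nn.Sequential(
            nn.Linear(256, 64), nn.ReLU(), nn.Linear(64, 1)
        )

    def forward(self, audio_feat, text_feat):
        a = self.audio_enc(audio_feat)   # audio latent code
        t = self.text_enc(text_feat)     # text latent code
        logits = self.classifier(torch.cat([a, t], dim=-1))
        return logits, a, t

def training_step(model, audio_feat, text_feat, labels, lambda_reg=0.1):
    """One training step: classification loss plus discriminator-based contrastive regularizer."""
    logits, a, t = model(audio_feat, text_feat)
    ce_loss = F.cross_entropy(logits, labels)

    # Positive pairs: audio/text latents from the same utterance (same emotion).
    pos = model.discriminator(torch.cat([a, t], dim=-1))
    pos_loss = F.binary_cross_entropy_with_logits(pos, torch.ones_like(pos))

    # Negative pairs: audio latents paired with in-batch shuffled text latents,
    # keeping only pairs whose emotion labels actually differ.
    perm = torch.randperm(a.size(0))
    mask = (labels != labels[perm]).float().unsqueeze(-1)
    neg = model.discriminator(torch.cat([a, t[perm]], dim=-1))
    neg_loss = F.binary_cross_entropy_with_logits(neg, torch.zeros_like(neg), reduction="none")
    neg_loss = (neg_loss * mask).sum() / mask.sum().clamp(min=1)

    # The regularizer pulls same-emotion latents together and pushes different-emotion latents apart.
    return ce_loss + lambda_reg * (pos_loss + neg_loss)
```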