如何承认言论是情感情感的----基于信息分解的研究 (How Speech is Recognized to Be Emotional - A Study Based on Information Decomposition) - 专知论文

会员服务 ·

0

INFORMS · 分解的 · Performer · MoDELS · RNN ·

2021 年 11 月 24 日

How Speech is Recognized to Be Emotional - A Study Based on Information Decomposition

翻译：如何承认言论是情感情感的----基于信息分解的研究

Haoran Sun,Lantian Li,Thomas Fang Zheng,Dong Wang

The way that humans encode their emotion into speech signals is complex. For instance, an angry man may increase his pitch and speaking rate, and use impolite words. In this paper, we present a preliminary study on various emotional factors and investigate how each of them impacts modern emotion recognition systems. The key tool of our study is the SpeechFlow model presented recently, by which we are able to decompose speech signals into separate information factors (content, pitch, rhythm). Based on this decomposition, we carefully studied the performance of each information component and their combinations. We conducted the study on three different speech emotion corpora and chose an attention-based convolutional RNN as the emotion classifier. Our results show that rhythm is the most important component for emotional expression. Moreover, the cross-corpus results are very bad (even worse than guess), demonstrating that the present speech emotion recognition model is rather weak. Interestingly, by removing one or several unimportant components, the cross-corpus results can be improved. This demonstrates the potential of the decomposition approach towards a generalizable emotion recognition.

翻译：人类将情感融入言语信号的方式是复杂的。例如,愤怒的人可能会增加他的声调和说话速度,并使用不礼貌的言词。在本文中,我们提出一份关于各种情感因素的初步研究报告,并调查其中每一种因素如何影响现代情感识别系统。我们研究的关键工具是最近推出的SpeeFlow模型,通过该模型,我们可以将言语信号分解成不同的信息因素(调、调、节奏) 。基于这一分解,我们仔细研究了每个信息组成部分的性能及其组合。我们研究了三种不同的言语情感,并选择了一个关注的共振RNN作为情感分类师。我们的结果显示,节奏是情感表达的最重要组成部分。此外,跨体的结果非常糟糕(甚至比猜测更糟 ), 表明目前的言语识别模型相当弱。有趣的是, 通过删除一个或几个无关重要的组成部分,跨体的结果可以改进。这显示了解调方法对于普遍情感识别的潜力。

0

相关内容

INFORMS

《计算机信息》杂志发表高质量的论文，扩大了运筹学和计算的范围，寻求有关理论、方法、实验、系统和应用方面的原创研究论文、新颖的调查和教程论文，以及描述新的和有用的软件工具的论文。官网链接：https://pubsonline.informs.org/journal/ijoc

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

专知会员服务

112+阅读 · 2020年5月15日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【NeurlPS2019论文总结】它是这样的:用于可解释图像识别的深度学习，This Looks Like That: Deep Learning for Interpretable Image Recognition

【NeurlPS2019论文总结】它是这样的:用于可解释图像识别的深度学习，This Looks Like That: Deep Learning for Interpretable Image Recognition

专知会员服务

22+阅读 · 2019年12月17日

【NLP| 推荐文章】语言语音处理（Speech and Language Processing(3rd ed.draft)）

专知会员服务

15+阅读 · 2019年11月24日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【报告推荐】线上食品推荐中的数据分析（Computational Data Analytics on the Web for Better Food Decision Making）

【报告推荐】线上食品推荐中的数据分析（Computational Data Analytics on the Web for Better Food Decision Making）

专知会员服务

16+阅读 · 2019年10月2日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

CCF C类 | IJCNN 2019 Special Section : 信息论与深度学习

CCF C类 | IJCNN 2019 Special Section : 信息论与深度学习

Call4Papers

5+阅读 · 2018年12月7日

已删除

将门创投

5+阅读 · 2018年11月15日

【论文推荐】最新八篇推荐系统相关论文—可解释推荐、上下文感知推荐系统、异构知识库嵌入、深度强化学习、移动推荐系统

【论文推荐】最新八篇推荐系统相关论文—可解释推荐、上下文感知推荐系统、异构知识库嵌入、深度强化学习、移动推荐系统

专知

17+阅读 · 2018年6月16日

【论文推荐】最新5篇语音识别（ASR）相关论文—音频对抗样本、对抗性语音识别系统、声学模型、序列到序列、口语可理解性矫正

【论文推荐】最新5篇语音识别（ASR）相关论文—音频对抗样本、对抗性语音识别系统、声学模型、序列到序列、口语可理解性矫正

专知

14+阅读 · 2018年2月4日

推荐｜深度强化学习聊天机器人（附论文）！

推荐｜深度强化学习聊天机器人（附论文）！

全球人工智能

4+阅读 · 2018年1月30日

【推荐】RNN/LSTM时序预测

【推荐】RNN/LSTM时序预测

机器学习研究会

25+阅读 · 2017年9月8日

Neural-FST Class Language Model for End-to-End Speech Recognition

Neural-FST Class Language Model for End-to-End Speech Recognition

Arxiv

0+阅读 · 2022年1月31日

Point-of-Interest Recommender Systems based on Location-Based Social Networks: A Survey from an Experimental Perspective

Arxiv

0+阅读 · 2022年1月27日

Multi-View Self-Attention Based Transformer for Speaker Recognition

Arxiv

0+阅读 · 2022年1月27日

The ABBE Corpus: Animate Beings Being Emotional

Arxiv

0+阅读 · 2022年1月25日

End-to-End Multi-speaker Speech Recognition with Transformer

Arxiv

8+阅读 · 2020年2月13日

DeepFakes: a New Threat to Face Recognition? Assessment and Detection

Arxiv

6+阅读 · 2018年12月20日

Multimodal Sentiment Analysis To Explore the Structure of Emotions

Arxiv

19+阅读 · 2018年5月25日

Leveraging Social Signal to Improve Item Recommendation for Matrix Factorization

Arxiv

6+阅读 · 2018年5月17日

Attention-based Group Recommendation

Arxiv

14+阅读 · 2018年4月18日

The Web as a Knowledge-base for Answering Complex Questions

Arxiv

5+阅读 · 2018年3月18日

VIP会员

文章信息

相关主题

相关VIP内容

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

专知会员服务

112+阅读 · 2020年5月15日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【NeurlPS2019论文总结】它是这样的:用于可解释图像识别的深度学习，This Looks Like That: Deep Learning for Interpretable Image Recognition

【NeurlPS2019论文总结】它是这样的:用于可解释图像识别的深度学习，This Looks Like That: Deep Learning for Interpretable Image Recognition

专知会员服务

22+阅读 · 2019年12月17日

【NLP| 推荐文章】语言语音处理（Speech and Language Processing(3rd ed.draft)）

专知会员服务

15+阅读 · 2019年11月24日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【报告推荐】线上食品推荐中的数据分析（Computational Data Analytics on the Web for Better Food Decision Making）

【报告推荐】线上食品推荐中的数据分析（Computational Data Analytics on the Web for Better Food Decision Making）

专知会员服务

16+阅读 · 2019年10月2日

热门VIP内容

开通专知VIP会员享更多权益服务

【博士论文】在低维和高维空间中分析、建模和转换潜在表征

从无人机到数据：揭示边缘计算作为新作战域

可解释人工智能的基础

大规模视觉模型中的基于提示的适应：综述

相关资讯

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

CCF C类 | IJCNN 2019 Special Section : 信息论与深度学习

CCF C类 | IJCNN 2019 Special Section : 信息论与深度学习

Call4Papers

5+阅读 · 2018年12月7日

已删除

将门创投

5+阅读 · 2018年11月15日

【论文推荐】最新八篇推荐系统相关论文—可解释推荐、上下文感知推荐系统、异构知识库嵌入、深度强化学习、移动推荐系统

【论文推荐】最新八篇推荐系统相关论文—可解释推荐、上下文感知推荐系统、异构知识库嵌入、深度强化学习、移动推荐系统

专知

17+阅读 · 2018年6月16日

【论文推荐】最新5篇语音识别（ASR）相关论文—音频对抗样本、对抗性语音识别系统、声学模型、序列到序列、口语可理解性矫正

【论文推荐】最新5篇语音识别（ASR）相关论文—音频对抗样本、对抗性语音识别系统、声学模型、序列到序列、口语可理解性矫正

专知

14+阅读 · 2018年2月4日

推荐｜深度强化学习聊天机器人（附论文）！

推荐｜深度强化学习聊天机器人（附论文）！

全球人工智能

4+阅读 · 2018年1月30日

【推荐】RNN/LSTM时序预测

【推荐】RNN/LSTM时序预测

机器学习研究会

25+阅读 · 2017年9月8日

相关论文

Neural-FST Class Language Model for End-to-End Speech Recognition

Neural-FST Class Language Model for End-to-End Speech Recognition

Arxiv

0+阅读 · 2022年1月31日

Point-of-Interest Recommender Systems based on Location-Based Social Networks: A Survey from an Experimental Perspective

Arxiv

0+阅读 · 2022年1月27日

Multi-View Self-Attention Based Transformer for Speaker Recognition

Arxiv

0+阅读 · 2022年1月27日

The ABBE Corpus: Animate Beings Being Emotional

Arxiv

0+阅读 · 2022年1月25日

End-to-End Multi-speaker Speech Recognition with Transformer

Arxiv

8+阅读 · 2020年2月13日

DeepFakes: a New Threat to Face Recognition? Assessment and Detection

Arxiv

6+阅读 · 2018年12月20日

Multimodal Sentiment Analysis To Explore the Structure of Emotions

Arxiv

19+阅读 · 2018年5月25日

Leveraging Social Signal to Improve Item Recommendation for Matrix Factorization

Arxiv

6+阅读 · 2018年5月17日

Attention-based Group Recommendation

Arxiv

14+阅读 · 2018年4月18日

The Web as a Knowledge-base for Answering Complex Questions

Arxiv

5+阅读 · 2018年3月18日

微信扫码咨询专知VIP会员