Real-time Speech Emotion Recognition Based on Syllable-Level Feature Extraction - Zhuanzhi (专知) Papers


April 26, 2022

Real-time Speech Emotion Recognition Based on Syllable-Level Feature Extraction


Abdul Rehman, Zhen-Tao Liu, Min Wu, Wei-Hua Cao, Cheng-Shan Jiang

From arXiv; 13 pages, 5 figures; currently under review by IEEE/ACM Transactions on Audio, Speech, and Language Processing.

Speech emotion recognition systems have high prediction latency because of the high computational requirements of deep learning models, and low generalizability mainly because of the poor reliability of emotional measurements across multiple corpora. To solve these problems, we present a speech emotion recognition system based on a reductionist approach of decomposing and analyzing syllable-level features. The Mel-spectrogram of an audio stream is decomposed into syllable-level components, which are then analyzed to extract statistical features. The proposed method uses formant attention, noise-gate filtering, and rolling normalization contexts to increase feature processing speed and tolerance to adversity. A set of syllable-level formant features is extracted and fed into a single-hidden-layer neural network that makes predictions for each syllable, as opposed to the conventional approach of using a sophisticated deep learner to make sentence-wide predictions. The syllable-level predictions help to achieve real-time latency and lower the aggregated error in utterance-level cross-corpus predictions. Experiments on the IEMOCAP (IE), MSP-Improv (MI), and RAVDESS (RA) databases show that the method achieves real-time latency while predicting with state-of-the-art cross-corpus unweighted accuracy of 47.6% for IE to MI and 56.2% for MI to IE.
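To make the last two ideas in the abstract concrete, the following is a minimal sketch of how per-syllable predictions might be combined into one utterance-level label, and how unweighted accuracy is commonly computed (the mean of per-class recalls). The abstract does not state the aggregation rule, so averaging class probabilities is an assumption here, and the function names `aggregate_utterance` and `unweighted_accuracy` are hypothetical; this is not the authors' implementation.

```python
from collections import defaultdict


def aggregate_utterance(syllable_probs):
    """Combine per-syllable class probabilities into one utterance label.

    Assumption: the probabilities are simply averaged over syllables and
    the class with the highest mean is returned. The paper may use a
    different aggregation rule.
    """
    n_syllables = len(syllable_probs)
    n_classes = len(syllable_probs[0])
    mean = [
        sum(p[c] for p in syllable_probs) / n_syllables
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=lambda c: mean[c])


def unweighted_accuracy(y_true, y_pred):
    """Mean of per-class recalls over the classes present in y_true.

    Every class counts equally regardless of how many samples it has,
    which is why this metric is preferred for imbalanced emotion corpora.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == pred:
            hits[truth] += 1
    recalls = [hits[c] / totals[c] for c in totals]
    return sum(recalls) / len(recalls)


# Three syllables, three emotion classes (illustrative numbers only).
utterance = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.7, 0.2, 0.1]]
label = aggregate_utterance(utterance)

# Per-class recalls here are 0.5, 1.0 and 0.0.
ua = unweighted_accuracy([0, 0, 1, 1, 2], [0, 1, 1, 1, 0])
```

Because each syllable is classified independently, a label like this can be updated as soon as a new syllable arrives instead of waiting for the end of the sentence, which is consistent with the low-latency goal described above.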

