使用新颖胶囊网模式识别情感扩音器 (Emotional Speaker Identification using a Novel Capsule Nets Model) - 专知论文

会员服务 ·

0

可辨认的 · Capsule · CapsNet · MoDELS · Performer ·

2022 年 1 月 9 日

Emotional Speaker Identification using a Novel Capsule Nets Model

翻译：使用新颖胶囊网模式识别情感扩音器

Ali Bou Nassif,Ismail Shahin,Ashraf Elnagar,Divya Velayudhan,Adi Alhudhaif,Kemal Polat

from arxiv, 11 pages, 8 figures

Speaker recognition systems are widely used in various applications to identify a person by their voice; however, the high degree of variability in speech signals makes this a challenging task. Dealing with emotional variations is very difficult because emotions alter the voice characteristics of a person; thus, the acoustic features differ from those used to train models in a neutral environment. Therefore, speaker recognition models trained on neutral speech fail to correctly identify speakers under emotional stress. Although considerable advancements in speaker identification have been made using convolutional neural networks (CNN), CNNs cannot exploit the spatial association between low-level features. Inspired by the recent introduction of capsule networks (CapsNets), which are based on deep learning to overcome the inadequacy of CNNs in preserving the pose relationship between low-level features with their pooling technique, this study investigates the performance of using CapsNets in identifying speakers from emotional speech recordings. A CapsNet-based speaker identification model is proposed and evaluated using three distinct speech databases, i.e., the Emirati Speech Database, SUSAS Dataset, and RAVDESS (open-access). The proposed model is also compared to baseline systems. Experimental results demonstrate that the novel proposed CapsNet model trains faster and provides better results over current state-of-the-art schemes. The effect of the routing algorithm on speaker identification performance was also studied by varying the number of iterations, both with and without a decoder network.

翻译：发言人识别系统被广泛用于各种应用中,通过声音识别某人;然而,由于语音信号的高度差异性,这是一项具有挑战性的任务。处理情绪变化非常困难,因为情感变化会改变一个人的语音特征;因此,声学特征不同于在中立环境中培训模型所用的声音特征;因此,通过中性语言培训的发言者识别模式无法正确识别情绪紧张的发言者。虽然使用超动性神经网络(CNN),有线电视新闻网无法利用低级别功能之间的空间联系。最近引入的胶囊网络(CapsNets)激发了这一挑战性任务。基于深度学习的胶囊网络(CapsNets)克服了CNNs在维护低级别特征与集中技术之间的面貌关系方面的不足,因此,本项研究调查了使用CapsNet识别情绪性演讲者在情感压力下的表现。基于CapsNet的语音识别模式是利用三个不同的语音数据库(e.e),Amirati Session 数据库、SASAS Dataset 和REVDESS (可自由访问) 的拟议模型与基线系统比较。实验性结果还用新式的模型研究了新式网络的模型和数字。

0

相关内容

可辨认的

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

专知会员服务

135+阅读 · 2021年6月16日

最新《自监督表示学习》报告，70页ppt

最新《自监督表示学习》报告，70页ppt

专知会员服务

86+阅读 · 2020年12月22日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

【领域对抗学习的低资源文本分类】Low-Resource Text Classification using Domain-Adversarial Learning

【领域对抗学习的低资源文本分类】Low-Resource Text Classification using Domain-Adversarial Learning

专知会员服务

23+阅读 · 2020年4月22日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

IEEE TII Call For Papers

IEEE TII Call For Papers

CCF多媒体专委会

3+阅读 · 2022年3月24日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Plenary Talk2

【ICIG2021】Latest News & Announcements of the Plenary Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年11月2日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

【论文推荐】最新七篇图像分类相关论文—条件标签空间、生成对抗胶囊网络、深度预测编码网络、生成对抗网络、数字病理图像、在线表示学习

【论文推荐】最新七篇图像分类相关论文—条件标签空间、生成对抗胶囊网络、深度预测编码网络、生成对抗网络、数字病理图像、在线表示学习

专知

17+阅读 · 2018年3月3日

无键相和压力信号条件下往复压缩机示功图故障特征提取方法研究

国家自然科学基金

0+阅读 · 2015年12月31日

北极海冰假交替单胞菌属细菌的多样性、系统分类及生态适应的遗传与生理基础

国家自然科学基金

0+阅读 · 2014年12月31日

语音缺失频谱重建及语音频谱二维相关性建模的研究

国家自然科学基金

0+阅读 · 2012年12月31日

智慧协同的绿色移动流媒体服务研究

国家自然科学基金

2+阅读 · 2012年12月31日

内蒙古木本固氮植物共生固氮菌多样性分析

国家自然科学基金

0+阅读 · 2012年12月31日

数据驱动的彩色图像颜色空间建模与去噪

国家自然科学基金

1+阅读 · 2012年12月31日

Cystatin B缺失与Prion疾病自噬作用机制的研究

国家自然科学基金

0+阅读 · 2011年12月31日

多天线OFDM信道全信息压缩估计理论与方法

国家自然科学基金

0+阅读 · 2011年12月31日

东祁连山高寒草地禾草内生细菌的生物功能及机制研究

国家自然科学基金

0+阅读 · 2011年12月31日

频域转换提高人工耳蜗植入者语音识别作用的试验研究

国家自然科学基金

0+阅读 · 2008年12月31日

Improving Self-Supervised Speech Representations by Disentangling Speakers

Arxiv

1+阅读 · 2022年4月20日

Combining Spectral and Self-Supervised Features for Low Resource Speech Recognition and Translation

Arxiv

0+阅读 · 2022年4月18日

Scalable and Real-time Multi-Camera Vehicle Detection, Re-Identification, and Tracking

Arxiv

0+阅读 · 2022年4月15日

Revisiting the Adversarial Robustness-Accuracy Tradeoff in Robot Learning

Revisiting the Adversarial Robustness-Accuracy Tradeoff in Robot Learning

Arxiv

0+阅读 · 2022年4月15日

Unsupervised Multi-Source Domain Adaptation for Person Re-Identification

Arxiv

14+阅读 · 2021年4月27日

A Survey on Knowledge Graph-Based Recommender Systems

Arxiv

92+阅读 · 2020年2月28日

Aspect-based Sentiment Classification with Aspect-specific Graph Convolutional Networks

Arxiv

11+阅读 · 2019年9月8日

Multimodal Sentiment Analysis To Explore the Structure of Emotions

Arxiv

19+阅读 · 2018年5月25日

Generative Adversarial Networks and Probabilistic Graph Models for Hyperspectral Image Classification

Arxiv

11+阅读 · 2018年2月10日

Multi-pseudo Regularized Label for Generated Samples in Person Re-Identification

Arxiv

12+阅读 · 2018年1月29日

VIP会员

文章信息

相关主题

相关VIP内容

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

专知会员服务

135+阅读 · 2021年6月16日

最新《自监督表示学习》报告，70页ppt

最新《自监督表示学习》报告，70页ppt

专知会员服务

86+阅读 · 2020年12月22日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

【领域对抗学习的低资源文本分类】Low-Resource Text Classification using Domain-Adversarial Learning

【领域对抗学习的低资源文本分类】Low-Resource Text Classification using Domain-Adversarial Learning

专知会员服务

23+阅读 · 2020年4月22日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

热门VIP内容

开通专知VIP会员享更多权益服务

最新《扩散模型原理》新书，470页pdf

无人机作战：演进、创新与未来战场

AI 智能体简史

多模态空间推理在大模型时代：综述与基准测试

相关资讯

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

IEEE TII Call For Papers

IEEE TII Call For Papers

CCF多媒体专委会

3+阅读 · 2022年3月24日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Plenary Talk2

【ICIG2021】Latest News & Announcements of the Plenary Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年11月2日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

【论文推荐】最新七篇图像分类相关论文—条件标签空间、生成对抗胶囊网络、深度预测编码网络、生成对抗网络、数字病理图像、在线表示学习

【论文推荐】最新七篇图像分类相关论文—条件标签空间、生成对抗胶囊网络、深度预测编码网络、生成对抗网络、数字病理图像、在线表示学习

专知

17+阅读 · 2018年3月3日

相关论文

Improving Self-Supervised Speech Representations by Disentangling Speakers

Arxiv

1+阅读 · 2022年4月20日

Combining Spectral and Self-Supervised Features for Low Resource Speech Recognition and Translation

Arxiv

0+阅读 · 2022年4月18日

Scalable and Real-time Multi-Camera Vehicle Detection, Re-Identification, and Tracking

Arxiv

0+阅读 · 2022年4月15日

Revisiting the Adversarial Robustness-Accuracy Tradeoff in Robot Learning

Revisiting the Adversarial Robustness-Accuracy Tradeoff in Robot Learning

Arxiv

0+阅读 · 2022年4月15日

Unsupervised Multi-Source Domain Adaptation for Person Re-Identification

Arxiv

14+阅读 · 2021年4月27日

A Survey on Knowledge Graph-Based Recommender Systems

Arxiv

92+阅读 · 2020年2月28日

Aspect-based Sentiment Classification with Aspect-specific Graph Convolutional Networks

Arxiv

11+阅读 · 2019年9月8日

Multimodal Sentiment Analysis To Explore the Structure of Emotions

Arxiv

19+阅读 · 2018年5月25日

Generative Adversarial Networks and Probabilistic Graph Models for Hyperspectral Image Classification

Arxiv

11+阅读 · 2018年2月10日

Multi-pseudo Regularized Label for Generated Samples in Person Re-Identification

Arxiv

12+阅读 · 2018年1月29日

相关基金

无键相和压力信号条件下往复压缩机示功图故障特征提取方法研究

国家自然科学基金

0+阅读 · 2015年12月31日

北极海冰假交替单胞菌属细菌的多样性、系统分类及生态适应的遗传与生理基础

国家自然科学基金

0+阅读 · 2014年12月31日

语音缺失频谱重建及语音频谱二维相关性建模的研究

国家自然科学基金

0+阅读 · 2012年12月31日

智慧协同的绿色移动流媒体服务研究

国家自然科学基金

2+阅读 · 2012年12月31日

内蒙古木本固氮植物共生固氮菌多样性分析

国家自然科学基金

0+阅读 · 2012年12月31日

数据驱动的彩色图像颜色空间建模与去噪

国家自然科学基金

1+阅读 · 2012年12月31日

Cystatin B缺失与Prion疾病自噬作用机制的研究

国家自然科学基金

0+阅读 · 2011年12月31日

多天线OFDM信道全信息压缩估计理论与方法

国家自然科学基金

0+阅读 · 2011年12月31日

东祁连山高寒草地禾草内生细菌的生物功能及机制研究

国家自然科学基金

0+阅读 · 2011年12月31日

频域转换提高人工耳蜗植入者语音识别作用的试验研究

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员