四方网域内学习演说情感代表</s> (Learning Speech Emotion Representations in the Quaternion Domain) - 专知论文

会员服务 ·

0

Performer · Networking · MoDELS · 潜在 · Learning ·

2023 年 3 月 3 日

Learning Speech Emotion Representations in the Quaternion Domain

翻译：四方网域内学习演说情感代表

Eric Guizzo,Tillman Weyde,Simone Scardapane,Danilo Comminiello

from arxiv, Accepted for Publication in IEEE/ACM Transactions on Audio, Speech and Language Processing

The modeling of human emotion expression in speech signals is an important, yet challenging task. The high resource demand of speech emotion recognition models, combined with the the general scarcity of emotion-labelled data are obstacles to the development and application of effective solutions in this field. In this paper, we present an approach to jointly circumvent these difficulties. Our method, named RH-emo, is a novel semi-supervised architecture aimed at extracting quaternion embeddings from real-valued monoaural spectrograms, enabling the use of quaternion-valued networks for speech emotion recognition tasks. RH-emo is a hybrid real/quaternion autoencoder network that consists of a real-valued encoder in parallel to a real-valued emotion classifier and a quaternion-valued decoder. On the one hand, the classifier permits to optimize each latent axis of the embeddings for the classification of a specific emotion-related characteristic: valence, arousal, dominance and overall emotion. On the other hand, the quaternion reconstruction enables the latent dimension to develop intra-channel correlations that are required for an effective representation as a quaternion entity. We test our approach on speech emotion recognition tasks using four popular datasets: Iemocap, Ravdess, EmoDb and Tess, comparing the performance of three well-established real-valued CNN architectures (AlexNet, ResNet-50, VGG) and their quaternion-valued equivalent fed with the embeddings created with RH-emo. We obtain a consistent improvement in the test accuracy for all datasets, while drastically reducing the resources' demand of models. Moreover, we performed additional experiments and ablation studies that confirm the effectiveness of our approach. The RH-emo repository is available at: https://github.com/ispamm/rhemo.

翻译：在语音信号中模拟人类情感表达是一项重要但富有挑战性的任务。高资源需求对语音情绪识别模型的需求,加上情绪标签数据普遍稀缺,是该领域有效解决方案开发和应用的障碍。在本文中,我们提出了一个共同规避这些困难的方法。我们的方法叫做RH-em, 是一个全新的半监督架构, 目的是从真实价值单声波谱中提取嵌入的精细嵌入内容, 从而能够使用四位网位定值的网络情感识别网络。 RH- 情绪识别模型是一个混合的实值/ 夸脱性自动电算器网络网络网络网络网络网络网络网络, 该网络由真实价值的编码器组成, 与真实价值的情感分析器和缩略微读值的解码器平行。一方面, 精选器允许优化与情感相关特性分类的嵌入的每一个潜在轴心轴: 价值、振奋、振奋、压、支配力和总体情绪变现, 重建, 使潜值的内心等值形成等等等值关系, 需要我们不断测试的内置的内置的内流数据。</s>

0

相关内容

Performer

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

76+阅读 · 2022年6月28日

语音识别:不同深度学习方法的综述，Speech Recognition: a review of the different deep learning approaches

语音识别:不同深度学习方法的综述，Speech Recognition: a review of the different deep learning approaches

专知会员服务

33+阅读 · 2022年3月13日

NLP必读经典文献100篇

专知会员服务

124+阅读 · 2020年9月8日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

31+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

Multi-Task Learning的几篇综述文章

Multi-Task Learning的几篇综述文章

深度学习自然语言处理

15+阅读 · 2020年6月15日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

IEEE | DSC 2019诚邀稿件 (EI检索)

IEEE | DSC 2019诚邀稿件 (EI检索)

Call4Papers

10+阅读 · 2019年2月25日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Heisenberg群与Minkowski空间中的非线性椭圆方程

国家自然科学基金

0+阅读 · 2014年12月31日

函数空间、几何和Mahler测度

国家自然科学基金

0+阅读 · 2014年12月31日

面向X-CT应用的(Ce, Lu)3(Cr, Al)5O12闪烁陶瓷中过渡金属离子的光谱展宽效应研究

国家自然科学基金

0+阅读 · 2014年12月31日

利用同步辐射光电子能谱原位研究NiO-ZnO的界面电子态结构

国家自然科学基金

0+阅读 · 2013年12月31日

Intraflagellar Transport运输纤毛蛋白的分子机理

国家自然科学基金

0+阅读 · 2012年12月31日

拓扑动力系统与分形几何

国家自然科学基金

1+阅读 · 2012年12月31日

水稻ERF转录因子家族IX亚组成员在抗病性中的功能及其作用机制

国家自然科学基金

1+阅读 · 2012年12月31日

金属咔咯配合物的催化反应

国家自然科学基金

0+阅读 · 2012年12月31日

海啸所引发的大气重力波之传播特性研究

国家自然科学基金

0+阅读 · 2011年12月31日

MgO(001)单晶势垒磁性隧道结的制备及其自旋相关器件的原理研究

国家自然科学基金

0+阅读 · 2009年12月31日

Deep Learning-empowered Predictive Precoder Design for OTFS Transmission in URLLC

Arxiv

0+阅读 · 2023年4月21日

Understanding and Constructing Latent Modality Structures in Multi-modal Representation Learning

Arxiv

11+阅读 · 2023年3月10日

Deep Long-Tailed Learning: A Survey

Arxiv

13+阅读 · 2021年10月9日

Temporal Graph Networks for Deep Learning on Dynamic Graphs

Arxiv

37+阅读 · 2020年10月9日

Financial Time Series Representation Learning

Financial Time Series Representation Learning

Arxiv

10+阅读 · 2020年3月27日

Learning in the Frequency Domain

Learning in the Frequency Domain

Arxiv

11+阅读 · 2020年3月12日

Meta-Transfer Learning for Zero-Shot Super-Resolution

Meta-Transfer Learning for Zero-Shot Super-Resolution

Arxiv

43+阅读 · 2020年2月27日

Multimodal Intelligence: Representation Learning, Information Fusion, and Applications

Arxiv

78+阅读 · 2019年11月10日

vGraph: A Generative Model for Joint Community Detection and Node Representation Learning

vGraph: A Generative Model for Joint Community Detection and Node Representation Learning

Arxiv

14+阅读 · 2019年9月17日

Hierarchical Graph Representation Learning with Differentiable Pooling

Hierarchical Graph Representation Learning with Differentiable Pooling

Arxiv

14+阅读 · 2018年6月26日

VIP会员

文章信息

相关主题

相关VIP内容

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

76+阅读 · 2022年6月28日

语音识别:不同深度学习方法的综述，Speech Recognition: a review of the different deep learning approaches

语音识别:不同深度学习方法的综述，Speech Recognition: a review of the different deep learning approaches

专知会员服务

33+阅读 · 2022年3月13日

NLP必读经典文献100篇

专知会员服务

124+阅读 · 2020年9月8日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

31+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

小规模训练指南：打造世界级大语言模型的关键方法

无人机编队飞行：复杂环境中作战的策略、挑战与应用

大模型APP，AI时代第一个爆款

从数据中心视角出发的高效大语言模型训练综述

相关资讯

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

Multi-Task Learning的几篇综述文章

Multi-Task Learning的几篇综述文章

深度学习自然语言处理

15+阅读 · 2020年6月15日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

IEEE | DSC 2019诚邀稿件 (EI检索)

IEEE | DSC 2019诚邀稿件 (EI检索)

Call4Papers

10+阅读 · 2019年2月25日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

相关论文

Deep Learning-empowered Predictive Precoder Design for OTFS Transmission in URLLC

Arxiv

0+阅读 · 2023年4月21日

Understanding and Constructing Latent Modality Structures in Multi-modal Representation Learning

Arxiv

11+阅读 · 2023年3月10日

Deep Long-Tailed Learning: A Survey

Arxiv

13+阅读 · 2021年10月9日

Temporal Graph Networks for Deep Learning on Dynamic Graphs

Arxiv

37+阅读 · 2020年10月9日

Financial Time Series Representation Learning

Financial Time Series Representation Learning

Arxiv

10+阅读 · 2020年3月27日

Learning in the Frequency Domain

Learning in the Frequency Domain

Arxiv

11+阅读 · 2020年3月12日

Meta-Transfer Learning for Zero-Shot Super-Resolution

Meta-Transfer Learning for Zero-Shot Super-Resolution

Arxiv

43+阅读 · 2020年2月27日

Multimodal Intelligence: Representation Learning, Information Fusion, and Applications

Arxiv

78+阅读 · 2019年11月10日

vGraph: A Generative Model for Joint Community Detection and Node Representation Learning

vGraph: A Generative Model for Joint Community Detection and Node Representation Learning

Arxiv

14+阅读 · 2019年9月17日

Hierarchical Graph Representation Learning with Differentiable Pooling

Hierarchical Graph Representation Learning with Differentiable Pooling

Arxiv

14+阅读 · 2018年6月26日

相关基金

Heisenberg群与Minkowski空间中的非线性椭圆方程

国家自然科学基金

0+阅读 · 2014年12月31日

函数空间、几何和Mahler测度

国家自然科学基金

0+阅读 · 2014年12月31日

面向X-CT应用的(Ce, Lu)3(Cr, Al)5O12闪烁陶瓷中过渡金属离子的光谱展宽效应研究

国家自然科学基金

0+阅读 · 2014年12月31日

利用同步辐射光电子能谱原位研究NiO-ZnO的界面电子态结构

国家自然科学基金

0+阅读 · 2013年12月31日

Intraflagellar Transport运输纤毛蛋白的分子机理

国家自然科学基金

0+阅读 · 2012年12月31日

拓扑动力系统与分形几何

国家自然科学基金

1+阅读 · 2012年12月31日

水稻ERF转录因子家族IX亚组成员在抗病性中的功能及其作用机制

国家自然科学基金

1+阅读 · 2012年12月31日

金属咔咯配合物的催化反应

国家自然科学基金

0+阅读 · 2012年12月31日

海啸所引发的大气重力波之传播特性研究

国家自然科学基金

0+阅读 · 2011年12月31日

MgO(001)单晶势垒磁性隧道结的制备及其自旋相关器件的原理研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员