Cross-Corpus Validation of Speech Emotion Recognition in Urdu using Domain-Knowledge Acoustic Features

Speech Emotion Recognition (SER) is a key affective computing technology that enables emotionally intelligent artificial intelligence. While SER is challenging in general, it is particularly difficult for low-resource languages such as Urdu. This study investigates Urdu SER in a cross-corpus setting, an area that has remained largely unexplored. We employ a cross-corpus evaluation framework across three different Urdu emotional speech datasets to test model generalization. Two standard domain-knowledge-based acoustic feature sets, eGeMAPS and ComParE, are used to represent speech signals as feature vectors, which are then passed to Logistic Regression and Multilayer Perceptron classifiers. Classification performance is assessed using unweighted average recall (UAR) to account for class-label imbalance. Results show that self-corpus validation often overestimates performance, with UAR exceeding cross-corpus evaluation by up to 13%, underscoring that cross-corpus evaluation offers a more realistic measure of model robustness. Overall, this work emphasizes the importance of cross-corpus validation for Urdu SER, and its findings contribute to advancing affective computing research for underrepresented language communities.
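To illustrate why UAR is a sensible metric under class imbalance, the following is a minimal sketch in plain Python. It is not the authors' code: the function names and the toy labels are assumptions made up for illustration. UAR is computed as the mean of per-class recalls, so each emotion class contributes equally regardless of how many samples it has.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls; every class counts equally, whatever its size."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")
    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


def accuracy(y_true, y_pred):
    """Fraction of samples predicted correctly (dominated by large classes)."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy, made-up labels (not from any of the Urdu corpora): 8 "neutral", 2 "angry".
y_true = ["neutral"] * 8 + ["angry"] * 2
# A degenerate classifier that always predicts the majority class.
y_pred = ["neutral"] * 10

acc = accuracy(y_true, y_pred)                   # 8 of 10 correct
uar = unweighted_average_recall(y_true, y_pred)  # recall 1.0 for neutral, 0.0 for angry
print(f"accuracy={acc:.2f}, UAR={uar:.2f}")
```

On this toy example the majority-class predictor looks acceptable by accuracy but scores only chance level on UAR for two classes, which is the behaviour that makes UAR suitable for imbalanced emotion labels.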

