Pre-trained Model Representations and their Robustness against Noise for Speech Emotion Analysis

Pre-trained model representations have demonstrated state-of-the-art performance in speech recognition, natural language processing, and other applications. Pre-trained models, such as Bidirectional Encoder Representations from Transformers (BERT) and Hidden-unit BERT (HuBERT), have made it possible to generate lexical and acoustic representations that benefit speech recognition applications. We investigated the use of pre-trained model representations for estimating dimensional emotions, such as activation, valence, and dominance, from speech. We observed that while valence may rely heavily on lexical representations, activation and dominance rely mostly on acoustic information. In this work, we used multi-modal fusion representations from pre-trained models to obtain state-of-the-art speech emotion estimation, and we showed relative improvements of 100% and 30% in concordance correlation coefficient (CCC) on valence estimation compared to standard acoustic and lexical baselines, respectively. Finally, we investigated the robustness of pre-trained model representations against noise and reverberation degradation and found that lexical and acoustic representations are impacted differently. Lexical representations are more robust to distortions than acoustic representations, and knowledge distillation from a multi-modal model helps to improve the noise robustness of acoustic-based models.
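The abstract reports its results as concordance correlation coefficient (CCC) but does not spell out the metric. The sketch below assumes the standard definition due to Lin, CCC = 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²), with population variance and covariance. The function name `concordance_cc` is hypothetical and is not taken from the paper.

```python
from statistics import fmean


def concordance_cc(predictions, targets):
    """Lin's concordance correlation coefficient between two equal-length sequences."""
    predictions = list(predictions)
    targets = list(targets)
    if len(predictions) != len(targets) or not predictions:
        raise ValueError("inputs must be non-empty and of equal length")

    mean_p = fmean(predictions)
    mean_t = fmean(targets)

    # Population (divide-by-N) variance and covariance, per the assumed definition.
    var_p = fmean([(p - mean_p) ** 2 for p in predictions])
    var_t = fmean([(t - mean_t) ** 2 for t in targets])
    cov = fmean([(p - mean_p) * (t - mean_t) for p, t in zip(predictions, targets)])

    denom = var_p + var_t + (mean_p - mean_t) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined when both inputs are the same constant")
    return 2 * cov / denom
```

Unlike Pearson correlation, CCC also penalizes a shift or scale mismatch between predictions and targets: predictions that follow the labels perfectly but are offset by a constant score below 1.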

