Vocal bursts play an important role in communicating affect, making them valuable for improving speech emotion recognition. Here, we present our approach for classifying vocal bursts and predicting their emotional significance in the ACII Affective Vocal Burst Workshop & Challenge 2022 (A-VB). We use a large self-supervised audio model as a shared feature extractor and compare multiple architectures built on classifier chains and attention networks, combined with uncertainty loss weighting strategies. Our approach surpasses the challenge baseline by a wide margin on all four tasks.