In this paper, we present our solutions for the Multimodal Sentiment Analysis Challenge (MuSe) 2022, which comprises the MuSe-Humor, MuSe-Reaction and MuSe-Stress sub-challenges. MuSe 2022 focuses on humor detection, emotional reactions and multimodal emotional stress, utilizing different modalities and datasets. In our work, several kinds of multimodal features are extracted, including acoustic, visual, text and biological features. These features are fused using TEMMA and GRU frameworks with a self-attention mechanism. The contributions of this paper are threefold: 1) several new audio features, facial expression features and paragraph-level text embeddings are extracted to improve accuracy; 2) we substantially improve the accuracy and reliability of multimodal sentiment prediction by mining and blending the multimodal features; 3) effective data augmentation strategies are applied in model training to alleviate the problem of sample imbalance and prevent the model from learning biased subject characteristics. For the MuSe-Humor sub-challenge, our model obtains an AUC score of 0.8932. For the MuSe-Reaction sub-challenge, the Pearson's Correlation Coefficient of our approach on the test set is 0.3879, which outperforms all other participants. For the MuSe-Stress sub-challenge, our approach outperforms the baseline in both arousal and valence on the test set, reaching a final combined result of 0.5151.
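To illustrate the self-attention fusion idea mentioned in the abstract, the following is a minimal NumPy sketch of scaled dot-product self-attention applied across per-modality feature vectors. It is not the authors' TEMMA or GRU implementation; the function name, shapes, and the mean-pooling step are assumptions made for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_fusion(modal_feats):
    """Fuse per-modality feature vectors with scaled dot-product self-attention.

    modal_feats: (num_modalities, dim) array, one row per modality
                 (e.g. acoustic, visual, text embeddings of equal dimension).
    Returns a (dim,) fused representation (mean over the attended rows).
    Hypothetical sketch, not the paper's architecture.
    """
    X = np.asarray(modal_feats, dtype=float)
    d = X.shape[1]
    scores = X @ X.T / np.sqrt(d)        # (M, M) pairwise modality similarities
    weights = softmax(scores, axis=-1)   # each row: attention over all modalities
    attended = weights @ X               # (M, dim) attended modality features
    return attended.mean(axis=0)         # pooled fused vector

# Example: fuse three hypothetical modality embeddings of dimension 8.
rng = np.random.default_rng(0)
fused = self_attention_fusion(rng.normal(size=(3, 8)))
```

In the actual systems, such attention layers would sit on top of sequence encoders (e.g. GRUs) rather than single pooled vectors, and the pooling would typically be learned.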