Large quantities of reliable data have been difficult to acquire for speech emotion datasets, and acted emotions may be exaggerated compared with the less expressive emotions displayed in everyday life. Recently, larger datasets with natural emotions have been created. Instead of ignoring smaller, acted datasets, this study investigates whether information learnt from acted emotions is useful for detecting natural emotions. Cross-corpus research has mostly considered cross-lingual and even cross-age datasets, and differences in how emotions are annotated cause a drop in performance. For consistency, four adult English datasets covering acted, elicited and natural emotions are considered. A state-of-the-art model is proposed so that the degradation of performance can be investigated accurately. The system uses a bi-directional LSTM with an attention mechanism to classify emotions across datasets. Experiments study the effects of training models in a cross-corpus and multi-domain fashion, and the results show that the transfer of information is not successful in these settings. Out-of-domain models that are subsequently adapted to the missing dataset, and domain adversarial training (DAT), are shown to be better suited to generalising to emotions across datasets. This shows positive information transfer from acted datasets to those with more natural emotions, and the benefits of training on different corpora.
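The abstract does not specify the form of the attention mechanism, so the following is only a minimal sketch of one common variant, attention pooling, written in plain Python. It assumes that each frame of an utterance has already been encoded as a vector (in the described system these would come from the bi-directional LSTM; here they are hard-coded toy values) and that a scoring vector `query` has been learned. The names `attention_pool`, `frames` and `query` are hypothetical and do not come from the paper.

```python
import math


def attention_pool(hidden_states, query):
    """Collapse a sequence of frame vectors into one utterance-level vector.

    hidden_states: list of frame vectors (assumed to be encoder outputs).
    query: scoring vector of the same dimension (assumed to be learned).
    """
    # Score each frame by its dot product with the query vector.
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]

    # Numerically stable softmax turns the scores into weights that sum to 1.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the frame vectors gives the pooled representation.
    dim = len(query)
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return pooled, weights


# Toy input: three 2-dimensional frames, the middle one being the most salient.
frames = [[0.1, 0.0], [2.0, 1.0], [0.2, -0.1]]
query = [1.0, 1.0]
utterance_vector, attention_weights = attention_pool(frames, query)
```

In a classifier of the kind described, the pooled vector would then be passed to an output layer that predicts the emotion class; that layer, like the encoder, is omitted from this sketch.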