Crises such as natural disasters, global pandemics, and social unrest continuously threaten our world and emotionally affect millions of people in distinct ways. Understanding the emotions that people express during large-scale crises helps inform policy makers and first responders about the emotional state of the population and helps provide emotional support to those who need it. We present CovidEmo, a dataset of ~3K English tweets labeled with emotions and temporally distributed across 18 months. Our analyses reveal the emotional toll caused by COVID-19, as well as changes in the social narrative and the associated emotions over time. Motivated by the time-sensitive nature of crises and the cost of large-scale annotation efforts, we examine how well large pre-trained language models generalize across domains and across time in the task of perceived emotion prediction in the context of COVID-19. Our analyses suggest that cross-domain information transfer occurs, yet significant gaps remain. We propose semi-supervised learning as a way to bridge these gaps, obtaining significantly better performance by using unlabeled data from the target domain.