Text-based emotion classification is a task with many applications that has received growing interest in recent years. This paper presents a preliminary study intended to help researchers and practitioners gain insight into relatively new datasets and into emotion classification in general. We focus on three datasets recently presented in the related literature, and we explore how traditional models as well as state-of-the-art deep learning models perform in the presence of different data characteristics. We also explore data augmentation as a way to improve performance. Our experiments show that state-of-the-art models such as RoBERTa perform best in all cases. We also provide observations and discussion that highlight the complexity of emotion classification in these datasets, and we test the applicability of the models to actual social media posts that we collected and labeled.