In the last few years, emotion detection in social media text has become a popular problem owing to its wide-ranging applications, which include understanding consumers better, psychology, aiding human-computer interaction, and designing smart systems. The problem has attracted considerable attention because social media, which people regularly use to express sentiments and opinions, provides huge amounts of data. In this paper, we present a Hinglish dataset labelled for emotion detection. We describe a deep learning based approach for detecting emotions in Hindi-English code-mixed tweets, using bilingual word embeddings derived from FastText and Word2Vec as well as transformer-based models. We experiment with various deep learning models, including CNNs, LSTMs, and bidirectional LSTMs (with and without attention), along with transformers such as BERT, RoBERTa, and ALBERT. The transformer-based BERT model outperforms all other models, achieving the best accuracy of 71.43%.