Emotion detection is an important task that can be applied to social media data to discover new knowledge. Deep learning methods are prevalent for this task, but they are black-box models whose decisions are hard for a human operator to interpret. Therefore, in this paper, we propose an approach using weighted $k$-Nearest Neighbours (kNN), a simple, easy-to-implement, and explainable machine learning model. These qualities can help to enhance the reliability of results and guide error analysis. In particular, we apply the weighted kNN model to the SemEval-2018 shared task on emotion detection in tweets. Tweets are represented using different text embedding methods and emotion lexicon vocabulary scores, and classification is done by an ensemble of weighted kNN models. Our best approaches obtain results competitive with state-of-the-art solutions and offer a promising alternative to neural network methods.
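To make the idea of an ensemble of weighted kNN models concrete, the following is a minimal sketch and not the implementation used in the paper. It assumes cosine similarity as the neighbour weight, a plain average of per-model scores as the ensemble rule, and tiny hand-made vectors in place of real tweet embeddings or lexicon scores; the function names `weighted_knn_scores` and `ensemble_predict` are hypothetical.

```python
import numpy as np


def weighted_knn_scores(train_X, train_y, query, k, n_classes):
    """Per-class scores for one query using similarity-weighted kNN.

    Assumption: cosine similarity is used as the vote weight.
    """
    # Cosine similarity between the query and every training vector.
    norms = np.linalg.norm(train_X, axis=1) * np.linalg.norm(query)
    sims = train_X @ query / np.maximum(norms, 1e-12)
    # Indices of the k most similar training examples.
    nearest = np.argsort(-sims)[:k]
    scores = np.zeros(n_classes)
    for i in nearest:
        # Each neighbour votes for its own label, weighted by similarity
        # (clipped at zero so a dissimilar neighbour cannot vote negatively).
        scores[train_y[i]] += max(sims[i], 0.0)
    total = scores.sum()
    return scores / total if total > 0 else scores


def ensemble_predict(train_views, train_y, query_views, k, n_classes):
    """Average weighted-kNN scores over several representations.

    Assumption: every representation ("view") gets equal weight.
    """
    per_view = [
        weighted_knn_scores(X, train_y, q, k, n_classes)
        for X, q in zip(train_views, query_views)
    ]
    mean_scores = np.mean(per_view, axis=0)
    return int(np.argmax(mean_scores)), mean_scores


# Toy data: four "tweets", two classes, two hypothetical representations.
train_y = np.array([0, 0, 1, 1])
view_a = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
view_b = np.array([[1.0, 0.0, 0.0], [0.8, 0.2, 0.0],
                   [0.0, 0.0, 1.0], [0.0, 0.2, 0.8]])
query_a = np.array([1.0, 0.05])
query_b = np.array([0.9, 0.1, 0.0])

label, scores = ensemble_predict(
    [view_a, view_b], train_y, [query_a, query_b], k=3, n_classes=2
)
print(label, scores)
```

Because each prediction is a weighted vote over identifiable training examples, the contributing neighbours and their weights can be listed for any decision, which is the kind of inspection that supports error analysis.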