Code-mixing (CM) is a frequently observed phenomenon in which multiple languages are used within a single utterance or sentence. It is most common on social media platforms and in informal conversations. Sentiment analysis (SA) is a fundamental task in NLP and is well studied for monolingual text. Code-mixing makes sentiment analysis harder because code-mixed text uses non-standard representations. This paper proposes a method that combines meta embeddings with a transformer for sentiment analysis on Dravidian code-mixed datasets. In our method, meta embeddings are used to capture rich text representations. We applied the proposed method to the task "Sentiment Analysis for Dravidian Languages in Code-Mixed Text", where it achieved F1 scores of $0.58$ and $0.66$ on the given Dravidian code-mixed datasets. The code is available on GitHub at https://github.com/suman101112/fire-2020-Dravidian-CodeMix.
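The abstract does not say how the meta embeddings are constructed. As an illustration only, the sketch below shows one common way to form a meta embedding: L2-normalize each source embedding and concatenate them per token. The function names, the placeholder tokens, the embedding dimensions, and the random vectors standing in for pre-trained embeddings are all assumptions made for this example, not details from the paper, and the transformer that would consume the result is not shown.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit L2 norm so no single source dominates."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)


def concat_meta_embedding(sources):
    """Concatenate per-token embeddings from several sources.

    sources: list of arrays, each of shape (num_tokens, dim_i).
    Returns an array of shape (num_tokens, sum of dim_i).
    """
    num_tokens = sources[0].shape[0]
    for s in sources:
        if s.shape[0] != num_tokens:
            raise ValueError("all sources must cover the same tokens")
    return np.concatenate([l2_normalize(s) for s in sources], axis=-1)


# Hypothetical setup: random vectors stand in for two pre-trained
# embedding tables; the tokens are placeholders, not real data.
rng = np.random.default_rng(0)
tokens = ["tok_a", "tok_b", "tok_c", "tok_d"]
emb_a = rng.normal(size=(len(tokens), 8))   # assumed 8-dim source
emb_b = rng.normal(size=(len(tokens), 12))  # assumed 12-dim source

meta = concat_meta_embedding([emb_a, emb_b])
print(meta.shape)  # (4, 20)
```

In a setup like this, the resulting matrix would be passed as the token-level input to a sequence encoder. Concatenation keeps every source's information intact at the cost of a larger input dimension; averaging after projection to a shared size is a frequently used alternative.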