Identifying named entities is a practical yet challenging task in Natural Language Processing. Named Entity Recognition (NER) on code-mixed text is more challenging still, owing to the linguistic complexity that arises from mixing languages. This paper describes the submission of team CMNEROne to the SemEval-2022 shared task 11, MultiCoNER. The code-mixed NER task aims to identify named entities in a code-mixed dataset. Our approach performs NER on the code-mixed dataset by leveraging multilingual data. We achieved a weighted average F1 score of 0.7044, 6% higher than the baseline.