SemEval-2023 Task 12: Sentiment Analysis for African Languages (AfriSenti-SemEval)

Shamsuddeen Hassan Muhammad, Idris Abdulmumin, Seid Muhie Yimam, David Ifeoluwa Adelani, Ibrahim Sa'id Ahmad, Nedjma Ousidhoum, Abinew Ayele, Saif M. Mohammad, Meriem Beloucif

We present the first Africentric SemEval shared task, Sentiment Analysis for African Languages (AfriSenti-SemEval). The dataset is available at https://github.com/afrisenti-semeval/afrisent-semeval-2023. AfriSenti-SemEval is a sentiment classification challenge in 14 African languages: Amharic, Algerian Arabic, Hausa, Igbo, Kinyarwanda, Moroccan Arabic, Mozambican Portuguese, Nigerian Pidgin, Oromo, Swahili, Tigrinya, Twi, Xitsonga, and Yorùbá (Muhammad et al., 2023). Each instance is labeled with one of three classes (positive, negative, or neutral). We present three subtasks: (1) Task A: monolingual classification, which received 44 submissions; (2) Task B: multilingual classification, which received 32 submissions; and (3) Task C: zero-shot classification, which received 34 submissions. The NLNDE team submitted the best-performing system for Tasks A and B, with weighted F1 scores of 71.31 and 75.06, respectively. UCAS-IIE-NLP achieved the best average performance on Task C, with a weighted F1 of 58.15. We describe the approaches adopted by the top 10 systems.
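
The results above are reported as weighted F1. As a minimal sketch of how that metric is commonly computed for the three labels named in the abstract, the following averages per-class F1 weighted by each class's share of the gold labels. The function name `weighted_f1` and the toy data are illustrative assumptions, and the official task scorer may differ in its details.

```python
from collections import Counter

# Label names taken from the abstract; the string spellings are an assumption.
LABELS = ("positive", "negative", "neutral")


def weighted_f1(gold, pred, labels=LABELS):
    """Return the support-weighted average of per-class F1 scores (0 to 1)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not gold:
        raise ValueError("gold must not be empty")

    support = Counter(gold)  # number of gold instances per class
    total = len(gold)
    score = 0.0
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        f1 = 2 * tp / denom if denom else 0.0
        score += f1 * support[label] / total
    return score


# Toy example (made-up labels, not task data).
gold = ["positive", "negative", "neutral", "positive"]
pred = ["positive", "negative", "positive", "positive"]
print(round(weighted_f1(gold, pred) * 100, 2))
```

Multiplying by 100 puts the score on the same scale as the figures quoted in the abstract.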
