The PCL detection task aims to identify and categorize language that is patronizing or condescending toward vulnerable communities in the general media. Compared with other paragraph-classification NLP tasks, the negative language in PCL detection is usually more implicit and subtle, which makes common text-classification approaches perform poorly. Targeting the PCL detection problem in SemEval-2022 Task 4, this paper introduces our team's solution, which exploits the power of prompt-based learning for paragraph classification. We reformulate the task as an appropriate cloze prompt and use pre-trained masked language models to fill the cloze slot. For the two subtasks, binary classification and multi-label classification, the DeBERTa model is adopted and fine-tuned to predict the masked label words of task-specific prompts. On the evaluation dataset, our approach achieves an F1-score of 0.6406 for binary classification and a macro-F1-score of 0.4689 for multi-label classification, ranking first on the leaderboard.
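The cloze-prompt reformulation described above can be sketched in a few lines. This is a minimal illustration, not the authors' actual code: the template, label words, and the toy scorer are all hypothetical stand-ins, and in the real system a fine-tuned DeBERTa masked language model would produce the scores for candidate label words at the [MASK] slot.

```python
# Sketch of prompt-based (cloze) classification. A paragraph is wrapped in a
# template containing a [MASK] slot; an MLM scores candidate label words at
# that slot, and the best-scoring word maps back to a class label.

TEMPLATE = "{text} Is the tone patronizing? It is [MASK]."  # hypothetical template
LABEL_WORDS = {"yes": 1, "no": 0}  # verbalizer: label word -> class id

def classify(text, mask_scorer):
    """mask_scorer(prompt, word) -> score for `word` filling the [MASK] slot.
    In practice this would query a fine-tuned MLM such as DeBERTa."""
    prompt = TEMPLATE.format(text=text)
    scores = {word: mask_scorer(prompt, word) for word in LABEL_WORDS}
    best_word = max(scores, key=scores.get)
    return LABEL_WORDS[best_word]

# Toy scorer standing in for a real MLM, only to show the control flow.
def toy_scorer(prompt, word):
    return 1.0 if ("poor things" in prompt) == (word == "yes") else 0.0

print(classify("Those poor things need our help.", toy_scorer))   # -> 1
print(classify("The council approved the budget.", toy_scorer))   # -> 0
```

The multi-label subtask follows the same pattern, except that each label gets its own prompt (or mask slot) and a per-label yes/no decision is made instead of a single argmax.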