In hate speech detection, developing training and evaluation datasets across various domains is a critical issue. However, the prevailing approach is to crawl social media texts and hire crowd-workers to annotate the data. Following this convention often restricts the scope of pejorative expressions to a single domain and limits generalization. Moreover, domain overlap between the training corpus and the evaluation set can lead to overestimated prediction performance when pretraining language models on a low-resource language. To alleviate these problems in Korean, we propose APEACH, which asks unspecified users to generate hate speech examples, followed by minimal post-labeling. We find that APEACH can collect useful datasets that are less sensitive to the lexical overlap between the pretraining corpus and the evaluation set, thereby measuring model performance properly.
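As a rough illustration of the overlap issue the abstract refers to, the sketch below shows one possible way to quantify lexical overlap between a pretraining corpus and an evaluation set, namely the fraction of evaluation-set token types that also appear in the corpus. This is a hypothetical sketch for intuition only, not the measurement protocol used by APEACH; the function names and the whitespace tokenization are assumptions.

```python
# Minimal sketch (not APEACH's published protocol): estimate lexical overlap
# as the share of evaluation-set token types covered by the pretraining corpus.
from typing import Iterable, Set


def token_types(texts: Iterable[str]) -> Set[str]:
    """Collect the set of whitespace-delimited token types in a text collection."""
    types: Set[str] = set()
    for text in texts:
        types.update(text.split())
    return types


def lexical_overlap(pretraining_texts: Iterable[str], eval_texts: Iterable[str]) -> float:
    """Return the fraction of evaluation token types that also occur in the corpus."""
    corpus_vocab = token_types(pretraining_texts)
    eval_vocab = token_types(eval_texts)
    if not eval_vocab:
        return 0.0
    return len(eval_vocab & corpus_vocab) / len(eval_vocab)


if __name__ == "__main__":
    # Toy placeholder sentences; real inputs would be the pretraining corpus
    # and the hate speech evaluation set.
    corpus = ["this is a pretraining sentence", "another corpus line"]
    evaluation = ["this evaluation sentence overlaps partially"]
    print(f"lexical overlap: {lexical_overlap(corpus, evaluation):.2f}")
```

A lower sensitivity of evaluation scores to this kind of overlap statistic is what the abstract means by "properly measuring the model performance."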