Hate speech detection in a cross-lingual setting is a problem of paramount importance for every medium- and large-scale online platform. Failure to address it at a global scale has repeatedly led to morally questionable real-world events, loss of human life, and the perpetuation of hate itself. This paper examines the capabilities of fine-tuned, altered multilingual Transformer models (mBERT, XLM-RoBERTa) on this crucial social data science task, with cross-lingual training from English to French, from French to English, and within each language on its own; it also includes sections on iterative improvement and comparative error analysis.