AraCOVID19-MFH: Arabic COVID-19 Multi-label Fake News and Hate Speech Detection Dataset

May 7, 2021

Mohamed Seghir Hadj Ameur, Hassina Aliane

Along with the COVID-19 pandemic, an "infodemic" of false and misleading information has emerged and has complicated the COVID-19 response efforts. Social networking sites such as Facebook and Twitter have contributed largely to the spread of rumors, conspiracy theories, hate, xenophobia, racism, and prejudice. To combat the spread of fake news, researchers around the world have made, and are still making, considerable efforts to build and share COVID-19 related research articles, models, and datasets. This paper releases "AraCOVID19-MFH", a manually annotated multi-label Arabic COVID-19 fake news and hate speech detection dataset. Our dataset contains 10,828 Arabic tweets annotated with 10 different labels. The labels have been designed to consider some aspects relevant to the fact-checking task, such as the tweet's check-worthiness, positivity/negativity, and factuality. To confirm our annotated dataset's practical utility, we used it to train and evaluate several classification models and reported the obtained results. Though the dataset is mainly designed for fake news detection, it can also be used for hate speech detection, opinion/news classification, dialect identification, and many other tasks.

