Drink bleach or do what now? Covid-HeRA: A dataset for risk-informed health decision making in the presence of COVID19 misinformation

Given the widespread dissemination of inaccurate medical advice related to the 2019 coronavirus pandemic (COVID-19), such as fake remedies, treatments and prevention suggestions, misinformation detection has emerged as an open problem of high importance and interest for the NLP community. To combat the potential harm of COVID-19-related misinformation, we release Covid-HeRA, a dataset for health risk assessment of COVID-19-related social media posts. More specifically, we study the severity of each misinformation story, i.e., how harmful a message can be if the audience believes it, and what types of signals can be used to discover highly malicious fake news and detect refuted claims. We present a detailed analysis, evaluate several simple and advanced classification models, and conclude with an experimental analysis that highlights open challenges and future directions.
