AraStance: A Multi-Country and Multi-Domain Dataset of Arabic Stance Detection for Fact Checking

With the continuing spread of misinformation and disinformation online, it is increasingly important to develop mechanisms that combat them at scale, in the form of automated systems that support multiple languages. One task of interest is claim veracity prediction, which can be addressed using stance detection with respect to relevant documents retrieved online. To this end, we present our new Arabic Stance Detection dataset (AraStance) of 910 claims from a diverse set of sources comprising three fact-checking websites and one news website. AraStance covers false and true claims from multiple domains (e.g., politics, sports, health) and several Arab countries, and it is well-balanced between related and unrelated documents with respect to the claims. We benchmark AraStance, along with two other stance detection datasets, using a number of BERT-based models. Our best model achieves an accuracy of 85% and a macro F1 score of 78%, which leaves room for improvement and reflects the challenging nature of AraStance and the task of stance detection in general.
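The gap between the reported accuracy (85%) and macro F1 (78%) is typical when some stance classes are rarer or harder than others, because macro F1 averages per-class F1 scores with equal weight. The sketch below illustrates the two metrics on toy data. The label names (`agree`, `disagree`, `discuss`, `unrelated`) and the toy predictions are assumptions made for illustration only; they are not taken from the abstract and are not the authors' evaluation code.

```python
def accuracy(gold, pred):
    """Fraction of examples whose predicted label matches the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over all labels seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    f1_scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1_scores.append(2 * precision * recall / (precision + recall))
        else:
            f1_scores.append(0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy data (hypothetical): the majority class is easy, one minority class is missed.
gold = ["unrelated"] * 6 + ["agree", "agree", "disagree", "discuss"]
pred = ["unrelated"] * 6 + ["agree", "agree", "agree", "discuss"]

print(f"accuracy = {accuracy(gold, pred):.2f}")
print(f"macro F1 = {macro_f1(gold, pred):.2f}")
```

On this toy data a single missed minority-class example costs 10 points of accuracy but pulls macro F1 down much further, since the `disagree` class contributes an F1 of zero to the average.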
