We present a novel collection of news articles from fake and real news media sources for analyzing and predicting news virality. Unlike existing fake news datasets, which contain either claims or the headline and body of news articles, each article in this collection is accompanied by a Facebook engagement count, which we treat as an indicator of the article's virality. We also provide the description and thumbnail image with which each article was shared on Facebook. The thumbnail images were automatically annotated with object tags and color attributes. They were also analyzed for faces with cloud-based vision analysis tools, and detected faces were annotated with facial attributes. We empirically investigate the use of this collection on an example task of article virality prediction.