False news spreading on social media has proliferated in recent years and has led to multifaceted threats in the real world. While false news has been studied in specific domains (such as politics or health care), little work compares false news across domains. In this article, we investigate false news across nine domains on Weibo, the largest Twitter-like social media platform in China, from 2009 to 2019. The newly collected data comprise 44,728 posts in the nine domains, published by 40,215 users and reposted over 3.4 million times. Based on the distributions and spreads of the multi-domain dataset, we observe that false news in domains close to daily life, such as health and medicine, generated more posts but diffused less effectively than false news in other domains such as politics, and that political false news diffused most effectively. The widely diffused false news posts on Weibo were strongly associated with certain types of users (by gender, age, etc.). Further, these posts provoked strong emotions in the reposts and diffused further with the active engagement of false-news starters. Our findings have the potential to help design false news detection systems in suspicious news discovery, veracity prediction, and display and explanation. Comparing the findings on Weibo with those of existing work reveals nuanced patterns, suggesting the need for more research on data from diverse platforms, countries, or languages to tackle the global issue of false news. The code and new anonymized dataset are available at https://github.com/ICTMCG/Characterizing-Weibo-Multi-Domain-False-News.