False claims about COVID-19 vaccines can undermine public trust in ongoing vaccination campaigns, thus posing a threat to global public health. Misinformation originating from various sources has been spreading online since the beginning of the COVID-19 pandemic. In this paper, we present a dataset of Twitter posts that exhibit a strong anti-vaccine stance. The dataset consists of two parts: a) a streaming keyword-centered data collection with more than 1.8 million tweets, and b) a historical account-level collection with more than 135 million tweets. The former leverages the Twitter streaming API to follow a set of specific vaccine-related keywords starting from mid-October 2020. The latter consists of all historical tweets of 70K accounts that were engaged in the active spreading of anti-vaccine narratives. We present descriptive analyses showing the volume of activity over time, geographical distributions, topics, news sources, and inferred account political leaning. This dataset can be used to study anti-vaccine misinformation on social media and can enable a better understanding of vaccine hesitancy. In compliance with Twitter's Terms of Service, our anonymized dataset is publicly available at: https://github.com/gmuric/avax-tweets-dataset