Developing an understanding of the public discourse on COVID-19 vaccination on social media is important not only for addressing the current COVID-19 pandemic, but also for future pathogen outbreaks. We examine a Twitter dataset containing 75 million English tweets discussing COVID-19 vaccination from March 2020 to March 2021. We train a stance detection algorithm using natural language processing (NLP) techniques to classify tweets as 'anti-vax' or 'pro-vax', and examine the main topics of discourse using topic modelling techniques. While pro-vax tweets (37 million) far outnumbered anti-vax tweets (10 million), a majority of tweets from both stances (63% anti-vax and 53% pro-vax tweets) came from dual-stance users who posted both pro- and anti-vax tweets during the observation period. Pro-vax tweets focused mostly on vaccine development, while anti-vax tweets covered a wide range of topics, some of which included genuine concerns, though there was a large dose of falsehoods. A number of topics were common to both stances, though pro- and anti-vax tweets discussed them from opposite viewpoints. Memes and jokes were amongst the most retweeted messages. Whereas concerns about polarisation and online prevalence of anti-vax discourse are unfounded, targeted countering of falsehoods is important.
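The dual-stance figures above (the share of each stance's tweets that came from users who posted both pro- and anti-vax tweets) can be illustrated with a minimal sketch. It assumes every tweet has already been given a stance label by some classifier; the function name `dual_stance_share`, the `"pro"`/`"anti"` labels, and the toy data are hypothetical and are not taken from the authors' pipeline.

```python
from collections import defaultdict


def dual_stance_share(tweets):
    """Fraction of each stance's tweets posted by dual-stance users.

    `tweets` is a list of (user_id, stance) pairs, where stance is
    "pro" or "anti" (hypothetical labels chosen for this sketch).
    """
    # Collect the set of stances each user has posted.
    stances_by_user = defaultdict(set)
    for user, stance in tweets:
        stances_by_user[user].add(stance)

    # A dual-stance user has posted at least one tweet of each stance.
    dual_users = {
        user for user, stances in stances_by_user.items()
        if {"pro", "anti"} <= stances
    }

    # Count tweets per stance, and how many of them come from dual-stance users.
    totals = defaultdict(int)
    from_dual = defaultdict(int)
    for user, stance in tweets:
        totals[stance] += 1
        if user in dual_users:
            from_dual[stance] += 1

    return {stance: from_dual[stance] / totals[stance] for stance in totals}


# Toy data: user "a" posts both stances, "b" only pro, "c" only anti.
toy = [
    ("a", "pro"), ("a", "anti"), ("b", "pro"),
    ("c", "anti"), ("a", "pro"), ("c", "anti"),
]
shares = dual_stance_share(toy)
```

On the toy data, two of the three pro tweets and one of the three anti tweets come from the single dual-stance user, so the shares are 2/3 and 1/3 respectively.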