Black Lives Matter (BLM) is a decentralized social movement protesting violence against Black individuals and communities, with a focus on police brutality. The movement gained significant attention following the killings of Ahmaud Arbery, Breonna Taylor, and George Floyd in 2020. The #BlackLivesMatter social media hashtag has come to represent the grassroots movement, while similar hashtags, such as #AllLivesMatter and #BlueLivesMatter, have been used to counter-protest it. We introduce a data set of 63.9 million tweets from 13.0 million users in over 100 countries, each containing one of the following keywords: BlackLivesMatter, AllLivesMatter, or BlueLivesMatter. This data set contains all currently available tweets from the beginning of the BLM movement in 2013 to 2021. We summarize the data set and show temporal trends in the use of both the BlackLivesMatter keyword and the keywords associated with counter-movements. Additionally, for each keyword, we create and release a set of Latent Dirichlet Allocation (LDA) topics (i.e., automatically clustered groups of semantically co-occurring words) to aid researchers in identifying linguistic patterns across the three keywords.