Twitter is perhaps the social media platform most amenable to research. Obtaining information from it requires only a few steps, and plenty of libraries can help in this regard. Nonetheless, determining whether a particular event is expressed on Twitter is a challenging task that requires a considerable collection of tweets. This proposal aims to facilitate the process of mining events on Twitter for interested researchers by opening a collection of processed information taken from Twitter since December 2015. The events could be related to natural disasters, health issues, and people's mobility, among other topics that can be studied with the proposed library. Different applications are presented in this contribution to illustrate the library's capabilities: an exploratory analysis of the topics discovered in tweets, a study on the similarity among dialects of the Spanish language, and a mobility report on different countries. In summary, the Python library presented is applied to different domains and retrieves a plethora of information: daily frequencies of words and bi-grams of words for the Arabic, English, Spanish, and Russian languages, as well as mobility information on the number of trips among locations for more than 200 countries or territories.