Twitter is perhaps the social media platform most amenable to research. Obtaining data takes only a few steps, and plenty of libraries help with the task. Nonetheless, determining whether a particular event is reflected on Twitter is challenging because it requires a considerable collection of tweets. This proposal aims to facilitate the process of mining events on Twitter for interested researchers. The events could be related to natural disasters, health issues, and people's mobility, among other topics that can be studied with the proposed library. This contribution presents several applications to illustrate the library's capabilities: an exploratory analysis of the topics discovered in tweets, a study of the similarity among dialects of the Spanish language, and a mobility report on different countries. In summary, the Python library presented here is applied to different domains and retrieves a large body of information processed from Twitter (since December 2015): words, word bigrams, and their daily frequencies for Arabic, English, Spanish, and Russian. The library also provides access to mobility information, namely the number of trips among locations for more than 200 countries or territories.