We present the Global Health Monitor, a Web-based system for detecting and mapping infectious disease outbreaks that appear in news stories. The system analyzes English news stories from news feed providers, classifies them for topical relevance, and plots them onto a Google map using geo-coding information, helping public health workers monitor the spread of diseases in a geo-temporal context. The background knowledge for the system is contained in the BioCaster ontology (BCO) (Collier et al., 2007a), which includes information on infectious diseases as well as geographical locations with their latitudes and longitudes. The system consists of four main stages: topic classification, named entity recognition (NER), disease/location detection, and visualization. Evaluation of the system shows that it achieved high accuracy on a gold standard corpus. The system is now in practical use. Running on a cluster computer, it monitors more than 1500 news feeds 24/7, updating the map every hour.
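As an illustration only, the first three stages can be sketched as a small pipeline. This is a minimal sketch under stated assumptions, not the system's implementation: the keyword rule standing in for topic classification, the dictionary lookup standing in for NER, the toy `DISEASES` and `LOCATIONS` tables standing in for the BCO, and all function names are hypothetical. The fourth stage, visualization, would consume the returned points and is not shown.

```python
from dataclasses import dataclass

# Toy stand-ins for ontology content (hypothetical; the real BCO is far richer).
DISEASES = {"cholera", "dengue", "measles"}
LOCATIONS = {  # name -> (latitude, longitude), approximate values
    "jakarta": (-6.2, 106.8),
    "nairobi": (-1.3, 36.8),
}
# Hypothetical cue words used as a placeholder relevance test.
RELEVANCE_CUES = {"outbreak", "cases", "infected", "epidemic"}


@dataclass(frozen=True)
class MapPoint:
    disease: str
    location: str
    lat: float
    lon: float


def tokenize(text):
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    return [t.strip(".,;:!?()\"'").lower() for t in text.split()]


def is_relevant(tokens):
    """Stage 1: topic classification (placeholder keyword rule)."""
    return any(t in RELEVANCE_CUES for t in tokens)


def recognize_entities(tokens):
    """Stage 2: NER by dictionary lookup (placeholder for a trained tagger)."""
    diseases = [t for t in tokens if t in DISEASES]
    locations = [t for t in tokens if t in LOCATIONS]
    return diseases, locations


def detect_events(diseases, locations):
    """Stage 3: pair each distinct disease with each distinct location."""
    return [
        MapPoint(d, loc, *LOCATIONS[loc])
        for d in sorted(set(diseases))
        for loc in sorted(set(locations))
    ]


def process_story(text):
    """Run stages 1 to 3; irrelevant stories yield no map points."""
    tokens = tokenize(text)
    if not is_relevant(tokens):
        return []
    diseases, locations = recognize_entities(tokens)
    return detect_events(diseases, locations)
```

Calling `process_story` on a story that mentions a known disease, a known location, and a relevance cue returns one `MapPoint` per disease/location pair, carrying the coordinates a mapping front end would need; a story that fails the relevance test returns an empty list.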