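Project title: Research on Location-Related Event Detection and Spatio-Temporal Anomalous Cluster Pattern Mining for Microblog Data
Project number: No.41471327
Project type: General Program (面上项目)
Year of approval: 2015
Discipline: Astronomy, Earth Sciences
Project author: Wan You (万幼)
Affiliation: Wuhan University (武汉大学)
Funding: 640,000 yuan (64万元)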
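Chinese abstract (translated): Event detection is an active research topic in information processing, and accurately identifying and analyzing where an event occurred is an important part of it. However, existing event detection methods fall short in acquiring geographic information from text and in computing the geospatial semantic relations of text. As a result, event extraction lacks the necessary positional reference, events are difficult to locate, and their spatial distribution cannot be analyzed effectively. This project combines techniques from geographic information retrieval and spatial data mining with traditional information processing methods to study the detection and data mining of location-related events. Microblogging is a new type of social media characterized by a large volume of information and real-time updates, and microblog data contain rich geospatial information. Taking Sina Weibo data as the research object, the project plans to design effective methods for place name recognition and disambiguation and to build an adaptive model that fuses geospatial semantic relevance with textual semantic relevance, so as to detect location-related events. Furthermore, it will mine the significant clustering and the anomalous cluster patterns in the spatio-temporal distribution of these events. The research will provide effective methods and technical support for learning, promptly and accurately, where events occur, how they are distributed in space, and how they evolve in space and time.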
Chinese keywords (translated): Spatio-temporal clustering; Spatial anomaly pattern; Knowledge discovery
English abstract: Event detection is an active topic in information processing research. An event can be defined as something that occurs at a specific time and location, so accurately extracting an event's location and analyzing event-related locations is an important task. However, existing event detection methods pay little attention to this and have several deficiencies: (1) They cannot obtain accurate geographic information from text, which leads to uncertainty in the final event location. (2) They do not calculate relevance based on the spatial relationships and geo-semantics of text, so the detected events lack the necessary geographic and spatial reference. (3) No further spatial analysis or spatial data mining can be carried out appropriately on those events, and no knowledge of their spatial distribution can be derived. To solve these problems, this project will combine geographic information retrieval approaches and spatial data mining technologies with traditional information processing methods for geo-location related event detection and knowledge discovery. Microblogging is a new type of social media characterized by wide coverage, high precision, a large amount of information, fast propagation, and real-time updates. Moreover, microblogging data are rich in geospatial information, which can be used for event detection. This project focuses on Sina Weibo data and will carry out the following research: (1) Design effective place name recognition and disambiguation methods based on this geospatial information. (2) Establish a self-adaptive model that fuses geospatial semantic relevance and text semantic relevance, in order to detect geo-location related events. (3) Perform spatial data mining on those events in order to find spatio-temporal abnormal cluster patterns. This research will provide effective methodological and technical support for detecting the place of occurrence, spatial distribution, and spatio-temporal evolution of events in text, and will also improve accuracy and timeliness.
English keywords: Spatio-temporal Cluster; Spatial Abnormal Pattern; Knowledge Discovery