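Project Title: Research on Scalable Storage and Indexing Techniques for Big Spatio-Temporal Data Analytics
Project Number: No.61300030
Project Type: Young Scientists Fund (青年科学基金项目)
Year of Approval: 2014
Discipline: Automation technology, computer technology
Principal Investigator: 谭浩宇 (Tan Haoyu)
Affiliation: 广州市香港科大霍英东研究院 (Guangzhou HKUST Fok Ying Tung Research Institute)
Funding Amount: CNY 230,000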
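Chinese Abstract: With the development of the mobile Internet, the Internet of Things, and high-speed wireless networks, devices such as smartphones, in-vehicle positioning systems, and a wide variety of sensors have come into widespread use, generating large volumes of data with time and location attributes. Based on these shared characteristics, such data are collectively called big spatio-temporal data. Many important applications, including urban planning, intelligent transportation, and location-based services, need to analyze big spatio-temporal data. Traditional data storage and indexing techniques pay insufficient attention to scalability, which leads to inefficient queries, slow data loading, and index creation and maintenance that cannot be parallelized, among other problems. As a result, they struggle to handle ultra-large-scale spatio-temporal data and cannot adequately support big spatio-temporal data analytics. To address the challenges of managing big spatio-temporal data, this project will study scalable storage and index structures, specifically: distributed data storage and indexing based on spatio-temporal proximity, efficient query processing and fast data loading for spatio-temporal data, and parallel construction and incremental updating of large-scale spatio-temporal indexes. The research is expected to actively promote the development of applications related to big spatio-temporal data.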
Chinese Keywords: Big data management;Spatio-temporal database;Data mining;Distributed system;Data loading
English Abstract: With the development of the mobile Internet, the Internet of Things, and high-speed wireless networks, a large number of devices capable of reporting their locations have come into wide use. These devices, such as smartphones, in-car positioning systems, and a variety of sensors, continuously generate huge amounts of spatio-temporal data, which we refer to as big spatio-temporal data. Many emerging applications, including urban planning, smart transportation, and location-based services, require extensive analysis of big spatio-temporal data. Existing data management systems are not scalable enough to store and index ultra-large-scale spatio-temporal data, which leads to inefficient query processing, slow data loading, and many other issues. This project will study scalable storage and index structures, specifically: spatio-temporal proximity-based data partitioning and indexing, efficient and scalable processing of spatio-temporal queries, and parallel building and incremental updating of very large spatio-temporal indexes. The research is expected to promote the development of applications related to big spatio-temporal data analytics.
English Keywords: Big data management;Spatio-temporal database;Data mining;Distributed system;Data loading