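Project title: Methods and Key Technologies for Massive Data Processing Based on Data Spaces
Project number: No. 61272185
Project type: General Program
Year of approval: 2013
Discipline: Automation technology, computer technology
Project author: Wang Nianbin (王念滨)
Affiliation: Harbin Engineering University (哈尔滨工程大学)
Funding: 820,000 CNY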
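Chinese abstract: Massive data processing has broad research value and application prospects in fields such as scientific exploration, environmental protection, network applications, business intelligence, and biological computing. Massive data processing centers on the data itself, and its core problems are how data is organized, managed, and analyzed. Compared with traditional data processing, the large volume and multiple formats of today's massive data pose new challenges for data management methods and data processing capacity. This project aims to build an efficient and reliable large-scale data processing platform, focusing on key technologies for shared-nothing cluster environments, including the organization and management of large-volume, multi-format data and high-performance query processing. It studies methods for organizing and managing massive data in a data space environment, proposes an organization and management model for multi-format data, and integrates structured, unstructured, and semi-structured data into a unified data organization model; it studies efficient indexing strategies for the data space environment and explores load-balancing strategies for massive data to improve system performance; and it studies semantic caching techniques for the data space environment to improve system responsiveness. The results are expected to provide a sound theoretical foundation for massive data processing, with broad application prospects and significant theoretical research value.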
Chinese keywords: Data Space; Data Model; Index; Query;
English abstract: Large-scale data processing has broad research value and application prospects in many fields, such as scientific exploration, environmental protection, network applications, business intelligence, and biotechnology. Data is the cornerstone of massive data processing, and the core issue is how data is organized, managed, and analyzed. Compared with traditional data processing, massive data is characterized by large volume and multiple formats, which poses new challenges for managing and processing it. The goal of this project is to build an efficient and reliable large-scale data processing platform. We study massive data processing from four aspects. First, we focus on key technologies in the shared-nothing cluster environment, such as high-performance data processing and the organization and management of large-volume, multi-format data; second, we investigate methods for organizing and managing massive data in the data space environment, present a model for organizing and managing multi-format data, and integrate structured, unstructured, and semi-structured data into a unified data organization model; third, we study an efficient indexing strategy in the data space environment, explore a massive d
English keywords: Data Space; Data Model; Index; Query;