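Project title: Research on High-Efficiency Storage and Management Methods for Big Data
Project number: No.U1435216
Project type: Joint Fund Project
Year of initiation/approval: 2015
Project discipline: Automation technology, computer technology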
Project author: Wu Yongwei (武永卫)
Author affiliation: Tsinghua University
Project funding: 1.05 million yuan
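Abstract (translated from Chinese): With the explosive growth of data volume and the increasing complexity of data models, the world has entered a networked big data era. The diversity and sustained rapid growth of big data, together with users' varied requirements for processing and storing it, pose new challenges for big data storage and management. This project studies high-efficiency storage and management methods for big data. It proposes an on-demand construction mechanism tailored to the I/O characteristics of big data applications and a self-adjusting runtime optimization method, and it supports running multiple big data organization and management models simultaneously. The goal is to improve overall efficiency in two respects: the overall utilization of hardware components across multi-tier storage media, and the number of big data items processed per unit time. Covering three aspects (the architecture of high-efficiency big data storage and management, supporting techniques, and application verification), the project addresses the following problems: on-demand customization and dynamic adjustment of application-customized storage systems; sensitive sensing of application storage characteristics with a self-feedback mechanism; efficiency management for highly concurrent storage of many kinds of complex data; dynamic data gathering and scattering mechanisms; storage scheduling that integrates storage, computation, and transfer; and methods for evaluating system efficiency when multiple application-customized storage systems coexist. Finally, application verification is carried out through streaming data management and analysis of satellite remote sensing big data and statistical analysis of ocean environment monitoring big data.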
Keywords (translated from Chinese): Big Data;Storage System;High Efficiency;On-demand Customization;Dynamic Self-tuning
English abstract: With the explosive growth of data size and data complexity, there is no doubt that we are entering the Big Data era. However, the diversity and continued growth of both data and users' requirements pose new challenges for existing data storage and management systems. To achieve high efficiency in terms of both hardware utilization at different levels and the number of data items processed per unit time, this project aims to develop a set of Big Data storage and management technologies, including a way to construct I/O-specific (application-customizable) storage systems on demand, a method for dynamic self-tuning of the constructed system, and a mechanism for simultaneously running multiple big data organization and management models on a pool of various devices. More specifically, the project focuses on three aspects of high-efficiency big data storage and management (architecture, supporting techniques, and application verification) and tries to solve the following problems: how to construct an application-customizable storage system on demand and tune it dynamically; how to efficiently sense and classify the I/O behavior of applications and automatically feed the result back to the storage management system; how to effectively and efficiently manage the storage of various highly concurrent complex data; how to gather and scatter data; how to schedule storage tasks in a storage-compute-transfer-aware way; and how to evaluate a storage system that allows multiple application-customizable storage systems to run simultaneously. Finally, a system will be developed and evaluated with real data from satellite remote sensing and ocean monitoring.
English keywords: Big Data;Storage System;High Efficiency;On-demand Customization;Dynamic Self-tuning