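Project title: Research on Optimization Techniques for MapReduce-Oriented Network Storage Systems
Project number: No.61272528
Project type: General Program
Approval year: 2013
Project discipline: Automation technology, computer technology
Project author: Xue Ruini (薛瑞尼)
Author affiliation: University of Electronic Science and Technology of China (电子科技大学)
Project funding: 820,000 CNY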
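Chinese abstract (translated): The MapReduce programming model, built on cloud computing, is an important approach to processing massive data. Improving the scalability, reliability, storage efficiency, and data access performance of MapReduce storage systems is an urgent need of real-world applications and a challenge for future information services based on massive data. Based on the file access patterns of MapReduce, and aiming at efficient storage of and highly concurrent access to massive data, this project studies optimization techniques for MapReduce storage systems, including: 1) distributed metadata management to improve system scalability and reliability; 2) adaptive file chunking to improve storage efficiency; 3) data prefetching to improve data access performance. By resolving bottlenecks encountered in practical MapReduce applications, the project explores frameworks and methods for integrating conventional distributed storage systems with MapReduce storage systems, providing new theory and supporting tools for deeper and more complex storage system optimization.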
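Chinese keywords (translated): Distributed systems; high availability; high reliability; scheduling optimization; storage optimization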
English abstract: MapReduce is one of the most important approaches to massive data processing in the cloud computing paradigm. Improving the scalability, dependability, storage efficiency, and data access performance of storage systems for MapReduce is both an urgent requirement of real-world applications and a challenge for future information services that target massive data. To address these issues, this proposal takes the file usage pattern of MapReduce as its starting point and aims at storage efficiency and highly concurrent access for massive data, conducting research on the following aspects: 1) distributed metadata management to improve scalability and dependability; 2) adaptive file chunking to improve storage efficiency; 3) data prefetching to improve file access performance. Besides eliminating the bottlenecks encountered in real-world applications, this proposal also contributes a discussion of the framework and scheme for fusing typical distributed storage systems with MapReduce storage systems, which would provide new theories and supporting tools for more comprehensive and more specific storage system optimization.
English keywords: Distributed System; High Availability; High Reliability; Scheduling Optimization; Storage Optimization