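Project title: Research on an Efficient Iteration Mechanism in the Hadoop Cloud Computing Framework
Project number: No.61201447
Project type: Young Scientists Fund project
Year of approval: 2013
Discipline: Electronics and Information Systems
Project author: 朱颢东 (Zhu Haodong)
Author affiliation: 郑州轻工业学院 (Zhengzhou University of Light Industry)
Funding: 240,000 CNY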
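Abstract (translated from Chinese): As a new computation model designed specifically for processing massive data, the Hadoop cloud computing framework has attracted great attention in recent years and has become a research hotspot in intelligent information processing. However, preliminary research shows that the model performs poorly on iterative operations, which to some extent limits its applicability. This project therefore focuses on the iteration function of the Hadoop cloud computing framework and designs a new iteration mechanism so that the framework can support iterative operations effectively. First, with the aim of improving iterative performance, a new Hadoop cloud computing framework is designed on the basis of the existing one. Next, a new iteration control module and a new application programming interface are designed for the new framework so that users can implement iterative operations conveniently. Then, cache and retrieval modules for data that can be reused across iterations are designed within the new framework, to reduce the I/O caused by repeatedly downloading such data from the Master node and to locate the data quickly when it is needed. Finally, task scheduling and fault-tolerance methods are designed to suit the characteristics of iteration. Progress and breakthroughs in this research are expected to further enrich the theoretical system of the Hadoop cloud computing framework and to provide a more effective means of processing massive data efficiently.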
Chinese keywords: 大数据 (big data); 云计算 (cloud computing); Hadoop 框架 (Hadoop framework); 迭代机制 (iteration mechanism)
English abstract: As a new computation model dedicated to massive data processing, the Hadoop framework for cloud computing has attracted great attention in academic circles at home and abroad in recent years and has become a research hotspot in intelligent information processing. However, preliminary research shows that the Hadoop framework cannot carry out iterative operations efficiently, which to some extent limits its applicability. This project will focus on the iteration function in the Hadoop framework and design a new loop-iteration mechanism to support iterative operations efficiently. First, to improve iterative performance, we design a new Hadoop framework for cloud computing based on the existing one. Then, for the new framework, we propose a new loop-iteration control module and a new application programming interface to help users implement iterative operations more conveniently. Subsequently, we present cache and index modules for data that can be reused across iterations, to reduce repeated loading of such data from the Master node and to index it efficiently on the Slave nodes. Lastly, we provide corresponding task scheduling and fault-tolerance methods suited to the new loop-iteration mechanism. The progress and breakthroughs of this project will further enrich the theoretical system of the Hadoop framework for cloud computing and provide a more effective means of processing massive data efficiently.
English keywords: Big Data; Cloud Computing; Hadoop Framework; Iterative Mechanism