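Project title: Research on Large-Scale Fixed-Point Iterative Computation Based on the BSP Model in Cloud Environments
Project number: No.61300023
Project type: Young Scientists Fund project
Year approved: 2014
Discipline: Automation Technology, Computer Technology
Principal investigator: Zhang Yanfeng (张岩峰)
Affiliation: Northeastern University (东北大学)
Funding: CNY 250,000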
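Abstract (translated from Chinese): Fixed-point iteration is pervasive in data mining and machine learning algorithms and is widely applied in social network analysis, high-performance computing, recommender systems, search engines, pattern recognition, and other fields. In recent years, researchers have begun to use cloud environments for large-scale fixed-point iterative computation to meet the needs of big data processing; this is a current research hotspot in cloud computing and big data, and a series of results has already been achieved. Building on this existing work, this proposal takes the BSP (Bulk Synchronous Parallel) model as its foundation and studies an improved BSP model suited to large-scale fixed-point iterative computation. To meet the performance optimization demands of the big data era, it studies iteration-process optimization based on multiple initial points, an asynchronous iteration model based on delta messages, and incremental processing techniques based on data dependencies, improving the processing speed of large-scale fixed-point iterative computation from multiple angles. In addition, to facilitate validation and dissemination of the results, the project will implement a prototype distributed computing framework that supports large-scale fixed-point iterative computation, based on the research described above.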
Keywords (translated from Chinese): distributed computing framework; iterative computation; graph processing; BSP; big data
Abstract (original English): Fixed-point iterations are common in data mining and machine learning algorithms, which are widely used in online social network analysis, high-performance computing, recommendation systems, search engines, pattern recognition, etc. In recent years, to meet the needs of big data processing, researchers have begun running large-scale fixed-point iterative computations in cloud environments, now an active research topic in cloud computing and big data, and have proposed a series of approaches and systems to support them. Building on this prior work, this proposal extends the BSP (Bulk Synchronous Parallel) model to support large-scale fixed-point iterative computations. To address the performance challenges that have recently emerged in big data processing, we will study multi-start iterative processes, a delta-based asynchronous iteration model, and dependency-based incremental processing. This work aims to improve the performance of large-scale iterative computations from several angles. In addition, to validate and disseminate our results, we will design and implement a prototype distributed computing framework supporting large-scale iterative computations, which will integrate al
Keywords (original English): distributed computing framework; iterative computation; graph processing; BSP; big data