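Project title: Research on Key Technologies for Graph-Model-Based Management of Massive RDF Data in Cloud Computing Environments
Project number: No.61502504
Project type: Young Scientists Fund project
Year of approval: 2016
Discipline: Automation technology; computer technology
Project author: Lu Wei (卢卫)
Affiliation: Renmin University of China
Funding amount: 220,000 CNY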
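Chinese abstract (translated): The enormous commercial success of cloud computing in the internet industry has inspired all sectors of society to explore using the technology to upgrade their business and analytics systems. Against this background, this project takes as its research object the massive RDF data that is widely used on the internet and growing at an exponential rate, and explores the key technologies for managing RDF data efficiently with a distributed graph computing framework on a cloud computing platform. Taking SPARQL queries as the object, it explores how to build a real-time SPARQL query processing system under this framework; taking RDF data as the object, it studies how to design optimal data partitioning and adaptive migration algorithms under this framework. Around these two scientific problems, the project will carry out research in five areas: a distributed graph computing framework for cloud computing environments that supports real-time processing requirements; distributed graph matching optimization techniques for SPARQL query processing; graph partitioning algorithms based on the vertex degree distribution of the graph; adaptive RDF data migration algorithms based on the query distribution; and prototype system development and application demonstration. Through research on these core technologies and the prototype system, the project aims to deepen the understanding of the characteristics of massive RDF data, master the key technologies for managing massive RDF data on cloud platforms, and lay a foundation for future large-scale research on Semantic Web applications.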
Chinese keywords (translated): Semantic Web; RDF data; cloud computing; distributed graph computing; graph partitioning
English abstract: The tremendous commercial success of cloud computing in the internet industry has inspired many sectors of society to explore the technology for upgrading their business and analysis systems. In this context, we take massive RDF data, which is widely used on the internet and still growing exponentially, as our research object. We explore how to use a distributed graph computing abstraction to process RDF queries efficiently in a cloud computing environment, including (1) how to build a real-time RDF query processing system in a distributed environment; and (2) how to partition RDF data properly under different data distributions and migrate it self-adaptively, so that the system provides high scalability and efficiency. To address these two scientific issues, we will carry out research in five areas: a real-time distributed graph computing framework in the cloud; distributed graph matching optimization techniques for processing RDF queries; graph partitioning algorithms based on degree distributions; self-adaptive data migration algorithms based on query distributions; and prototype system development and demonstration. Research on these key techniques and the prototype system will help deepen the understanding of massive RDF data, establish the key techniques for managing it in the cloud, and provide a solid foundation for future large-scale research on Semantic Web applications.
English keywords: Semantic Web; RDF Data; Cloud Computing; Distributed Graph Computing; Graph Partitioning