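Project title: Optimization Theory and Methods for Keyword Search on Graph Databases Based on Matched Vertex Pruning
Project number: No.61202036
Project type: Young Scientists Fund project
Year approved: 2013
Discipline: Computer Science
Project author: 钟鸣 (Zhong Ming)
Affiliation: Wuhan University
Funding: 230,000 CNY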
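Chinese abstract: With the rise of the NoSQL movement and the growing volume of valuable graph data generated by popular applications such as social networks, graph databases have become an important research trend and are being rapidly commercialized. Applying keyword search from the field of information retrieval to graph database queries lets users find structured information without mastering complex query languages or database schemas, making it a new technology with very promising application prospects. However, keyword search on graph databases has extremely high computational complexity, and its efficiency and scalability problems have long lacked effective solutions. This project will study optimization theory and techniques for keyword search on graph databases. The main research topics are: designing a heuristic matched-vertex pruning strategy, independent of any specific search algorithm, that reasonably constrains the quality and semantic patterns of search results to shrink the search space; discovering the set of semantic patterns that interest users through statistical analysis of user behavior and of the associations within the data itself, and designing an algorithm that can match efficiently against a large number of semantic patterns simultaneously; and effectively organizing and exploiting the local topological and semantic information in a graph database to build an index structure that can quickly prune matched vertices before search, along with distributed parallel processing techniques for the index to better support large-scale graph databases.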
Chinese keywords: graph database; keyword search; index
English abstract: With the rise of the NoSQL movement and the growing volume of valuable graph data produced by popular applications such as social networks, graph databases have become one of the most important research trends, and a wide range of related industry products are being developed rapidly. Keyword search, a typical technology from the field of information retrieval, has become one of the major ways of querying graph databases. It enables users to retrieve structured information without knowing complex query languages or database schemas. Hence, keyword search on graph databases is a very promising new technology. However, it is also known to be a very hard problem because of its extremely high computational complexity, and for lack of effective solutions its efficiency and scalability remain poor. This project will study optimization theory and techniques for keyword search on graph databases. The main research objectives include the following. Design a heuristic matched-vertex pruning strategy that does not depend on any specific search algorithm. With this pruning strategy, constraints on the quality and semantic patterns of search results can be set reasonably to reduce the search space. Through statistic
English keywords: graph database; keyword search; index