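Project title: Research on Assisted Generation Techniques for Keyword Query Algorithms over XML Data
Project number: No.61272124
Project type: General Program
Year of approval: 2013
Discipline: Automation technology, computer technology
Project author: Chen Ziyang (陈子阳)
Affiliation: Yanshan University (燕山大学)
Funding: 800,000 yuan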
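Chinese abstract (translated): Keyword query is an effective means of retrieval and has long been one of the central research problems in XML data management. Although researchers in China and abroad have proposed many semantics and corresponding algorithms and have built various prototype systems, XML keyword query technology remains at the level of academic discussion. The main reasons are three contradictions that arise in practice: between the diversity of data and the single semantics a system supports, between the diversity of requirements and the complexity of implementation, and between inefficient algorithms and large-scale data. These seriously hinder the effective application of XML keyword query technology in practice. Unlike existing work, which studies XML keyword query techniques aimed at ordinary users and based on a single semantics, this project focuses on assisted generation techniques for XML keyword query algorithms that are aimed at researchers and system developers and that support multiple semantics, high efficiency, and extensibility. The problems studied include computing a superset of the results that satisfy multiple semantics, efficient construction of result subtrees, effective result grouping and presentation strategies, and efficient parallel computing strategies. The ultimate goal is to build an efficient, extensible XML keyword query processing platform that supports multiple semantics and helps researchers and system developers build systems quickly and support various semantics.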
Chinese keywords (translated): Extensible Markup Language; keyword query; result subtree; top-down processing strategy;
English abstract: As an effective way to retrieve useful information, keyword search on XML data is regarded as one of the most important research issues. Although researchers have proposed various semantics and algorithms, and constructed prototypes to verify the effectiveness of their solutions, keyword search on XML data still remains at the level of academic discussion. The reason is the contradictions between (1) the variety of data types and the single semantics supported by a system, (2) the differing requirements on semantics and the complexity of implementing any algorithm, and (3) the inefficiency of existing algorithms and large-scale XML data, which have greatly hindered the practical application of keyword search techniques on XML data. Unlike existing methods, which focus on supporting a single semantics for common users, this project focuses on efficient and extensible techniques that support multiple semantics and can be used to help generate various algorithms for different semantics. The main problems to be solved include computing a superset of the results of various semantics, efficiently constructing subtrees, effective classification and ranking of results, and an efficient parallel execution strategy. The final objective is to construct a platform that is efficient and extendable
English keywords: XML;keyword search;subtree results;top-down processing strategy;