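Research on Knowledge Exploration Mechanisms for the Massive Deep Web Based on Semantic Computing

Project title: Research on Knowledge Exploration Mechanisms for the Massive Deep Web Based on Semantic Computing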

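Project number: No.61272411

Project type: General Program (面上项目)

Approval year: 2013

Discipline: Automation technology, computer technology

Principal investigator: Zhao Feng (赵峰)

Institution: Huazhong University of Science and Technology (华中科技大学)

Funding: 800,000 CNY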

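Abstract (translated from Chinese): The Deep Web holds abundant, high-quality, strongly domain-specific information and is growing rapidly; it has gradually become the main body of information carried by the Internet. In recent years, searching for and discovering the information users need that is hidden behind the massive Deep Web, managing it reliably, analyzing it accurately, understanding it comprehensively, and providing pervasive/personalized knowledge services have become a focus of research for many scholars at home and abroad. Addressing the massive scale, dynamics, and uncertainty of the Deep Web, and the challenges raised by changing search patterns and the data "quantity-quality" conflict, this project aims to improve the usability of Deep Web information and to achieve efficient knowledge retrieval and discovery over the massive Deep Web. It studies key mechanisms for massive Deep Web knowledge exploration, including data discovery and classification, data sampling, semantic computation and dynamic evolution, and knowledge evaluation and retrieval optimization, laying a foundation for "quantity-quality" fusion and "information-knowledge" conversion on the massive Deep Web. The topic is forward-looking, and the research has significant theoretical and practical value; the results can be applied directly to Internet resource management, providing new and effective technical means and broadening the research field of Internet information retrieval.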

Chinese keywords (translated): Deep Web; knowledge exploration; real-time search; semantic computing; data crawling

English abstract: The deep web refers to web data sources backed by databases that general search engines do not index. It contains abundant, high-quality, strongly domain-specific information and is growing rapidly, and it is gradually becoming the main body of information carried by the Internet. With the explosion of the deep web, searching for and discovering the knowledge in hidden web documents has become a persistent challenge. Recently, reliably managing, accurately analyzing, and understanding the massive deep web have become major goals of deep web exploration, and providing pervasive and personalized knowledge services has also become a research hotspot. Addressing the massive scale, dynamics, and uncertainty of the deep web, and the challenges caused by the data "quantity-quality" conflict and changing search patterns, this project studies key mechanisms of deep web exploration: discovery and classification of massive deep web data sources, deep web data sampling, semantic computing and dynamic evolution, and knowledge evaluation and retrieval optimization. These lay a foundation for "quantity-quality" fusion and "information-knowledge" conversion. The project aims to improve the usabilit

English keywords: Deep web; Knowledge exploration; Real-time searching; Semantic computing; Data crawling
