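Project title: Research on Ontology-Based Automatic Patent Indexing
Project number: No. 61271304
Project type: General Program
Year of approval: 2013
Discipline: Radio electronics, telecommunications technology
Principal investigator: Lü Xueqiang (吕学强)
Affiliation: Beijing Information Science and Technology University (北京信息科技大学)
Funding: 750,000 yuan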
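Abstract (translated from the Chinese): In today's information explosion, the number of patent documents is growing geometrically, and society faces problems such as the difficulty of retrieving information at massive scale and the rising cost of manual in-depth indexing of patent documents. This project proposes an automatic patent indexing method based on domain ontology. By studying key techniques including patent term mining, domain ontology base construction, patent text representation, and index term discovery, it focuses on two major problems in automatic patent indexing: domain ontology base construction and automatic indexing. Starting from the user retrieval information in patent query logs and from the content and structural features of patent text, the project proposes a term discovery method based on query features, a term mining method based on domain coupling degree, and patent text representation models based on a structure-register net and a concept-word structure. By mining domain concepts to build a patent ontology base and measuring concept similarity on that base, the project achieves automatic patent indexing and further improves the completeness and accuracy with which index terms describe patent text. Through this research, a patent-based ontology base can be built, a patent text representation that is more complete in content and more comprehensive in semantics can be realized, and the efficiency of automatic patent indexing can be improved. This provides indexing theory, methods, and resources for the next generation of patent information retrieval and promotes national economic and social development.
Keywords (translated from the Chinese): patent text; patent term; patent ontology; patent knowledge; patent indexing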
English abstract: With today's explosion of information, the number of patent documents has grown geometrically, which has caused a series of problems, such as the difficulty of massive information retrieval and the increasing cost of manual in-depth indexing of patent documents. This project therefore proposes an automatic patent indexing method based on domain ontology. Through research on patent term mining, domain ontology base construction, patent text representation, and index term discovery, the project focuses on two problems: construction of the domain ontology base and automatic indexing. Starting from the user retrieval information in patent query logs and the structural features of patent text, the project proposes a terminology discovery method based on query features, a term mining method based on domain coupling, and a patent text representation model based on structure-register nets and concept-word space. By mining domain concepts to construct a patent ontology and then measuring conceptual similarity on it, the method further improves the completeness and accuracy with which index terms describe a patent. Through this research, a patent-based ontology can be constructed to make the representation of patent text more comprehensive in semantics and more complete in content, so that more
English keywords: Patent text; Patent term; Patent ontology; Patent knowledge; Patent indexing