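Research on Methods for Acquiring Attribute and Attribute-Value Knowledge from Web Text

Project title: Research on Methods for Acquiring Attribute and Attribute-Value Knowledge from Web Text

Project number: No.61272361

Project type: General Program

Year of approval: 2013

Discipline: Automation technology, computer technology

Project author: Zhang Chunxia (张春霞)

Author affiliation: Beijing Institute of Technology (北京理工大学)

Project funding: 800,000 yuan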

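Chinese abstract (translated): Acquiring knowledge about the attributes and attribute values of concepts and their instances is a frontier topic in Web text mining and information extraction. Attribute and attribute-value knowledge is a core component of ontologies, an important foundation for building the Semantic Web, and a prerequisite for knowledge sharing and interoperability. Its acquisition has become a bottleneck that constrains the development of intelligent information processing technologies such as information retrieval and text classification. Existing work mainly extracts explicit attributes and attribute values from structured web pages and from semi-structured web pages made up largely of list-type text, and the associated methods are often restricted to specific domains, concepts, or attributes. To address these problems, this project will systematically study theoretical models and core methods for acquiring attribute and attribute-value knowledge of concepts and concept instances from Web text, specifically: (1) models and methods for how attribute and attribute-value knowledge is expressed in Web text; (2) a multi-dimensional classification system for attributes and attribute values; (3) domain-adaptive methods for extracting and learning explicit and implicit attribute and attribute-value knowledge; (4) methods for verifying attribute and attribute-value knowledge. On this basis, a knowledge acquisition platform for concepts and concept instances will be developed, and the proposed knowledge extraction, learning, and verification methods will be evaluated and analyzed on that platform.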

Chinese keywords (translated): attribute; attribute value; knowledge acquisition; knowledge verification; Web mining

English abstract: The automatic acquisition of attributes and attribute values of concepts and their instances is a research frontier in web content mining and information extraction. Knowledge of attributes and their values is a crucial component of ontologies, a foundation for building the Semantic Web, and a precondition for knowledge sharing and interoperability. Acquiring this kind of knowledge has become a bottleneck that hinders the development of intelligent information processing techniques such as information retrieval, text classification, and text clustering. Existing work mainly focuses on extracting explicit attributes and their values from structured web pages and from semi-structured web pages that consist largely of item lists. Moreover, existing methods are usually restricted to specific domains, concepts, or attributes. To address these problems, this project will systematically study theoretical models and core algorithms for acquiring attributes and their values from web texts. The research contents include: (1) constructing representation models of attributes and their values in web texts; (2) building a multi-dimensional classification framework for attributes and their values; (3) designing a domain-adaptive approach to extracting and learning explicit and implicit attributes and their values; (4) devising a verification approach for knowledge about

English keywords: attributes; attribute values; knowledge acquisition; knowledge verification; web mining

