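Project title: Question-Answer Resource Mining for Heterogeneous Web Information Sources
Project number: No.61073127
Project type: General Program
Year of approval: 2011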
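Project discipline: Weapons industry
Project author: Liu Bingquan (刘秉权)
Author affiliation: Harbin Institute of Technology (哈尔滨工业大学)
Project funding: 110,000 yuan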
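Abstract (translated from the Chinese): The project studies methods for automatically acquiring knowledge from heterogeneous web resources and for representing and using that knowledge in the form of question-answer pairs. The specific research content includes quantitative evaluation of question-answer semantic relevance, mining and application of structured non-textual features in web communities, automatic learning of answer templates for factoid questions, and construction of related question-answer knowledge retrieval systems. By adopting a Deep Learning model architecture, the project solved the problem of quantifying semantic relevance for short texts typified by question-answer pairs; the conclusions are of general significance for research on short-text information mining. The project carried out a fairly in-depth study of how social information in web communities can be applied to question-answer resource mining, and the study shows that appropriately introducing non-textual features plays a very important role in locating question-answer information. Through research on mining semi-structured and structured knowledge on the web, the project made a preliminary exploration of automatic learning of factoid answer templates. In addition, by combining theoretical results with practical applications, the project team developed several practical online prototype systems. The project has accumulated a certain amount of theoretical experience and corpus resources for further research on automatic question answering, and has also laid a foundation for deeper research on short-text information processing.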
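Keywords (translated from the Chinese): question-answer resource mining; heterogeneous information sources; question-answer semantic relevance; non-textual features; factoid answer templates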
English abstract: This project studies methods for automatically mining knowledge from heterogeneous web information sources and representing it as question-answer (QA) pairs. The specific research topics include quantifying the semantic relevance of QA pairs, mining and applying structured non-textual features in web communities, automatically learning factoid answer templates, and building the related knowledge retrieval systems. By introducing a Deep Learning model, the problem of quantifying semantic relevance for QA pairs is partly solved, and the conclusions are broadly relevant to all research on short texts. The project studied non-textual features in depth, and the results indicate that these features contribute substantially to detecting QA pairs in web communities. Through research on mining semi-structured and structured knowledge on the web, the project made a preliminary study of learning factoid answer templates. In addition, building on the theoretical results, several online prototype systems were built and applied. The project has accumulated substantial theoretical experience and corpora, and has laid a foundation for deeper research in short-text information processing.
English keywords: Question-answer resource mining; heterogeneous web information source; semantic relevance of QA pairs; non-textual features; factoid answer template