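Project title: Research on Key Technologies of Community Question Answering Based on Content Analysis and Behavior Analysis
Project number: No.61272332
Project type: General Program
Approval year: 2013
Discipline: Automation technology; computer technology
Principal investigator: Zhao Jun (赵军)
Affiliation: Institute of Automation, Chinese Academy of Sciences
Funding: CNY 800,000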
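Chinese abstract: Question answering is an important research topic in natural language understanding and information retrieval. However, limited by the current level of natural language processing and artificial intelligence technology, automatic question answering systems can presently handle only a very narrow range of question types and struggle to meet the personalized, complex information needs of real users. With the rise of Web 2.0, Internet services based on user-generated content have become increasingly popular. If massive community question answering data can be effectively mined and utilized, and combined with deep question answering techniques, it could strongly advance question answering technology. With the effective mining and utilization of community question answering data as its overall goal, this proposal approaches the problem from two directions, analysis of community text content and analysis of user behavior, and studies four key techniques of community question answering systems: (1) large-category classification of short-text questions based on space compression and semantic knowledge expansion; (2) dynamic generation of new category labels based on shortest-path fusion; (3) question-answer pair retrieval based on a highly robust phrase translation model and large-scale graph structure mining; (4) best answerer recommendation based on user interest modeling and weakly labeled learning from user behavior. These results can, on the one hand, be applied directly to community question answering systems to raise their level of intelligence; on the other hand, they will also have an important influence on the development of automatic question answering systems.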
Chinese keywords: Community Question Answering; Question Answering; Information Extraction
English abstract: Question answering is a significant research direction in the fields of natural language understanding and information retrieval. However, limited by the current state of natural language processing and artificial intelligence, automatic question answering can only solve limited types of questions. Therefore, it is difficult to meet the complex information needs of different users. With the rise of Web 2.0, user-generated content has become more and more popular; effectively mining and utilizing web-scale community question answering data, and combining these techniques with deep question answering, could greatly advance the development of question answering. This project aims to effectively mine and utilize community question answering data by analyzing the text content of questions and answers and the behavior information of community users. Based on the above analysis, this project focuses on the following key techniques: (1) large-scale short text classification based on space compression and semantic knowledge expansion; (2) dynamic generation of new category labels based on shortest path; (3) question-answer retrieval based on robust phrase translation and large-scale graph mining; (4) best answerer recommendation based on user interest modeling and manifold ranking learning. The above achievements can not only be dire
English keywords: Community Question Answering; Question Answering; Information Extraction