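Project title: User-Personalized Query Intent Analysis for Search Engines
Project number: No.61202277
Project type: Young Scientists Fund project
Year of approval: 2013
Discipline: Computer Science
Project author: Chen Yiheng (陈毅恒)
Affiliation: Harbin Institute of Technology (哈尔滨工业大学)
Funding: 220,000 CNY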
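Chinese abstract (translated): Query intent analysis is an important topic in information retrieval research and plays a significant role in improving search engine performance and the user search experience. However, current research on query intent analysis does not make full use of users' personalized information. This project therefore proposes a query intent analysis method that incorporates user personalization information. Specifically, the method covers the following main aspects: (1) using user personalization information as features in the query intent analysis model, so that the analysis results reflect differences among users; (2) a method for identifying and mining common query intents based on users' naturally annotated resources, which can both identify macro-level user query intents and automatically mine fine-grained query intents; (3) a topic-model-based method for modeling personalized user search interests, which learns user models more accurately and improves the results of query intent analysis; (4) using query category information as an important feature for query intent identification, together with a URL-based query classification algorithm that can greatly improve the efficiency of query classification; (5) applying the proposed query intent analysis method to search result clustering, that is, centering on multiple distinct query intents, the search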
Chinese keywords (translated): Query Intent Analysis; User Personalization Modeling; Query Classification; Search Result Clustering;
English abstract: Analysis of web query intents is an important research topic in Web information retrieval and plays a crucial role in improving search engine performance and the user search experience. However, previous research has paid little attention to information about individual users. This work therefore proposes a query intent analysis framework that incorporates personal information. Specifically, our approach focuses on: 1) Modeling query intents with personal information as features, which aims to identify the specific query intents of each individual user. 2) Query intent identification and mining methods based on users' naturally annotated Web resources. The proposed approaches can both identify query intents at the macro level and automatically mine fine-grained query intents. 3) A topic-model-based personalized search framework, which can effectively learn user interests and improve the effectiveness of query intent analysis. 4) A query classification algorithm based on URL topic, which can improve the efficiency of query classification and can further be used as an important feature for identifying query intents. 5) A search result clustering algorithm based on identified query intents; specifically, the algorithm can cluster search results and present them according to identified personalized qu
English keywords: Query Intent Analysis; Personalized User Modeling; Query Classification; Search Result Clustering;