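Project title: Research on Personalized Cross-Language Information Retrieval Based on Multilingual User Models
Project number: No.61300129
Project type: Young Scientists Fund project
Year of initiation/approval: 2014
Discipline: Automation technology, computer technology
Project author: Zhou Dong (周栋)
Author affiliation: Hunan University of Science and Technology (湖南科技大学)
Project funding: 270,000 yuan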
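Chinese abstract (translated): The diversity of languages in which online information resources are expressed, together with the limited number of languages that individual web users command, prevents people from fully accessing information from multiple channels. Cross-language information retrieval has become a necessary means of resolving this tension. However, current research on cross-language information retrieval has not yet adequately accounted for users' interests and preferences, which makes it difficult to obtain cross-language retrieval results that are both high-confidence and presented in a personalized way. To address this problem, the project intends to combine cross-language and personalized information retrieval techniques and to explore methods for improving the accuracy of cross-language information retrieval and user satisfaction. Specifically, the project will study: a hybrid modeling method based on probabilistic topic models for building granular multilingual user interest models; a semi-supervised machine learning method based on word graph networks for translating and expanding query terms; and a structured analysis technique based on semantic extraction for personalized ranking and optimization of search results. Finally, the effectiveness of the proposed methods will be verified through experimental testing of the proposed theories and algorithms and through implementation of a system prototype.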
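Chinese keywords (translated): cross-language information retrieval;personalized information retrieval;query expansion;search results optimization;user model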
English abstract: The linguistic diversity of information resources on the web, combined with the limited number of languages mastered by ordinary web users, hinders full access to the wealth of multi-channel information that is available. Cross-language information retrieval has become an essential approach to resolving this contradiction. Despite this, current studies give limited consideration to users' multilingual interests, and therefore may not be able to achieve high accuracy and personalized result presentation. To solve these problems, this project intends to combine personalized information retrieval and cross-language information retrieval, exploring related technologies to improve personalized multilingual search accuracy and user satisfaction. The main research to be conducted in this project includes: a hybrid method based on probabilistic topic models for building granular multilingual user interest models; a semi-supervised machine learning method based on word graphs for query translation and expansion; and a structured analysis technique based on semantic extraction for personalized result ranking and optimization. Lastly, we will verify the effectiveness of the proposed methods through experiments on the theories and algorithms developed and through the implementation of a prototype system.
English keywords: cross-language information retrieval;personalized information retrieval;query expansion;search results optimization;user model