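Project title: Learning-to-Rank Algorithms and Theory Based on User Evaluation Criteria
Project number: No.61203298
Project type: Young Scientists Fund
Approval year: 2013
Discipline: Automation
Project author: Lan Yanyan (兰艳艳)
Author affiliation: Institute of Computing Technology, Chinese Academy of Sciences
Funding: 260,000 yuan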
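Chinese abstract (translated): With the rapid growth of information on the Internet, search engines have become an important means of helping users obtain information, and their core is ranking. In recent years, learning-to-rank techniques have been applied to search engines on a large scale and have achieved great commercial success. This has also driven the development of learning to rank as a discipline, making it a research topic that receives wide attention in academia. The essence of learning to rank is to reflect the user's evaluation of the order in which documents satisfy the user's information need for a given query. However, the learning-to-rank algorithms, the labeling methods on which the training data rely, and the objectives of learning to rank are all inconsistent with user evaluation criteria, and this has become the greatest challenge facing learning to rank. To address this challenge, this project takes Internet search as its application scenario and proposes three lines of work: studying the statistical consistency of learning-to-rank algorithms with respect to evaluation criteria, to guide the design of algorithms that incorporate position information; studying labeling strategies based on pairwise preference relations, to guide the design of labeling methods based on relative relevance; and studying a learning-to-rank framework based on probabilistic graphical models, to guide the design of a framework that integrates the objectives of relevance and diversity. This research can help build learning-to-rank models that better match user evaluation criteria and promote wider and more effective use of learning-to-rank techniques on the Internet.
Chinese keywords (translated): learning to rank; evaluation criteria; statistical consistency; diversity;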
English abstract: With the rapid growth of the World Wide Web, search engines have become an essential tool for obtaining information efficiently and effectively, and ranking lies at their core. In recent years, learning-to-rank techniques have been widely applied in search engines and have achieved great commercial success. This has in turn advanced research on learning to rank, which has attracted wide attention from the research community. The essence of learning to rank is to reflect a user's evaluation of the order in which documents satisfy the user's information need for a given query. However, the learning-to-rank algorithms, the labeling methods, and the learning objectives are all inconsistent with evaluation measures defined from the perspective of the user's information need, which makes learning to rank a challenging problem in real applications. To address these challenges, we will study the statistical consistency of ranking algorithms with respect to IR evaluation measures, in order to design new learning-to-rank algorithms that take position information into account; labeling strategies based on pairwise relative preference, in order to build a relative labeling method; and a learning-to-rank framework based on graphical models, in order to build a new framework that combines the goals of relevance and diversity. We adopt web search as the application in this
English keywords: Learning to Rank;Evaluation Measures;Statistical Consistency;Diversity;