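Project title: Design of Local Learning Algorithms for Semi-Supervised Ranking and Study of Their Generalization Performance
Project number: No.61300143
Project type: Young Scientists Fund project
Year of approval: 2014
Discipline: Automation technology, computer technology
Project author: Pan Zhibin (潘志斌)
Author affiliation: Huazhong Agricultural University
Project funding: 230,000 yuan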
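Chinese abstract (translated): Ranking is the core problem in many real-world applications such as information retrieval and collaborative filtering, and it has become a new research focus in machine learning. Because real applications often involve large numbers of unlabeled samples, this project mainly studies the semi-supervised ranking problem. Current semi-supervised ranking algorithms are mainly based on manifold learning; such algorithms are global, their performance depends on how the graph is constructed, and their space and time complexity are both high, which makes them hard to apply to large-scale real-world problems. Reducing the computational complexity of semi-supervised ranking algorithms is a difficult problem that urgently needs to be solved. This project intends to use the idea of local risk minimization from statistical learning theory to design and analyze localized semi-supervised ranking algorithms, and to clarify how the localization strategy and its optimization method affect the learning rate of ranking. The main research content includes: designing a localized semi-supervised ranking algorithm model; studying in depth the construction of kernel functions grounded in the practical problem setting; designing a localized, sparse, regularized semi-supervised ranking algorithm; exploring fast solution techniques for it; analyzing its consistency and convergence; and establishing generalization error bounds based on the capacity of the hypothesis function space. This project is the first to introduce the ideas of local learning and sparse learning into semi-supervised ranking, and it is expected to enrich statistical learning theory and advance ranking applications.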
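Chinese keywords (translated): learning to rank; regularized ranking; generalization analysis; coefficient-based regularization; multiscale kernel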
English abstract: Ranking is the central problem in many real-world applications such as information retrieval and collaborative filtering, and it has become a new research focus in the machine learning community. Because real applications involve large amounts of unlabeled samples, this project studies the semi-supervised ranking problem. Current semi-supervised ranking algorithms are mainly based on manifold learning, which is global, and their performance depends on the constructed graphs. The time and space complexity of manifold ranking is high, which makes it difficult to apply to large-scale real-world problems. Reducing the computational complexity of semi-supervised ranking has therefore become a difficult problem that needs to be solved. This project aims to use the idea of local risk minimization from statistical learning theory to design and analyze localized semi-supervised ranking algorithms, and to demonstrate how the localization strategy and its optimization technique influence the learning rate of ranking. Our main research topics include designing a localized semi-supervised kernel ranking algorithm, constructing kernel functions based on the background of the real problem, designing a local and sparse semi-supervised ranking algorithm, studying its fast optimization technique, analyzing the consistency
English keywords: learning to rank; regularized ranking; generalization analysis; coefficient-based regularization; multiscale kernel