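Individualized Classification Learning for Large-Scale Data

Project name: Individualized Classification Learning for Large-Scale Data

Project number: No.61263032

Project type: Regional Science Fund Project

Year of approval: 2013

Discipline: Automation technology, computer technology

Project author: Fan Zizhu (范自柱)

Author affiliation: East China Jiaotong University (华东交通大学)

Funding: 450,000 yuan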

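Chinese abstract: When classifying large-scale complex data such as images, web pages, and videos, traditional classification learning methods such as subspace learning often perform poorly. One important reason is that these methods are not well targeted: they cannot effectively learn the structure of large-scale complex data. To process and learn from such data effectively, this study attempts to propose a novel learning framework: individualized learning. Its basic idea is to use different models or different training samples to learn and classify different test samples. Because the boundaries between classes in large-scale, complex, heterogeneous data are mostly highly nonlinear, a main goal of the proposed individualized classification learning is to locate these boundaries as accurately as possible and thereby classify correctly, which traditional learning methods find difficult to do. The individualized learning in this study combines theories such as local discriminant analysis, ensemble learning, and sparse representation classification, and studies their learning mechanisms in depth; for each test sample it seeks an optimal learning strategy, yielding an entirely new learning framework. The successful completion of this study will greatly enrich the basic theory of machine learning and pattern recognition.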

Chinese keywords: machine learning; feature extraction; kernel methods; discriminant analysis; sparse representation

English abstract: When learning from and classifying large-scale complex data such as images, web pages, and videos, traditional classification approaches such as subspace learning usually do not achieve the desired classification performance. One important reason is that their learning procedures are not well directed or purposeful. As a result, these traditional approaches cannot effectively learn the structure of large-scale complex data. To handle and effectively learn from such data, this study tries to propose a novel learning framework, i.e., individualized learning. Its basic idea is that, for different test samples, individualized learning uses different models or different training samples to learn and classify them. Indeed, the boundaries among different classes are often highly nonlinear in large-scale, complex, and heterogeneous data. The individualized learning that this study will propose aims to find these boundaries as precisely as possible and to perform correct classification. By contrast, traditional methods can hardly achieve this goal. The individualized learning in this study combines the theories of local discriminant analysis, ensemble learning, and sparse representation learning, among others. F

English keywords: machine learning; feature extraction; kernel methods; discriminant analysis; sparse representation
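
The abstract's basic idea, a different training subset and model for each test sample, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the project's actual method: the function name `individualized_classify`, the parameters `k` and `ridge`, the use of Euclidean nearest neighbours to pick the per-sample training subset, and the ridge-regularized (l2) representation, which stands in for true sparse (l1) coding to keep the example short and dependent only on NumPy.

```python
import numpy as np

def individualized_classify(X_train, y_train, x_test, k=10, ridge=1e-2):
    """Classify one test sample with a model built only for that sample.

    Hypothetical sketch: select the k nearest training samples, represent
    the test sample as a ridge-regularized linear combination of them, and
    return the class whose samples leave the smallest residual.
    """
    # Step 1: choose a training subset specific to this test sample.
    dists = np.linalg.norm(X_train - x_test, axis=1)
    idx = np.argsort(dists)[:k]
    A = X_train[idx].T            # columns are the selected training samples
    labels = y_train[idx]

    # Step 2: solve min_w ||x - A w||^2 + ridge * ||w||^2 in closed form.
    gram = A.T @ A + ridge * np.eye(A.shape[1])
    w = np.linalg.solve(gram, A.T @ x_test)

    # Step 3: reconstruct x from each class's samples alone and keep the
    # class with the smallest reconstruction residual.
    best_label, best_residual = None, np.inf
    for c in np.unique(labels):
        mask = labels == c
        residual = np.linalg.norm(x_test - A[:, mask] @ w[mask])
        if residual < best_residual:
            best_label, best_residual = c, residual
    return best_label


# Toy data (made up for this example): two small clusters in the plane.
X_train = np.array([[1.0, 0.0], [1.1, 0.1], [0.9, -0.1],
                    [0.0, 1.0], [0.1, 1.1], [-0.1, 0.9]])
y_train = np.array([0, 0, 0, 1, 1, 1])

label_a = individualized_classify(X_train, y_train, np.array([1.0, 0.05]), k=6)
label_b = individualized_classify(X_train, y_train, np.array([0.05, 1.0]), k=6)
```

Because the neighbourhood and the representation are recomputed for every test sample, the effective decision boundary is assembled from many local models, which is one simple way to approximate a highly nonlinear boundary. A smaller `k` makes each model more local; replacing the ridge step with an l1 solver would bring the sketch closer to sparse representation classification.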
