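Project title: Research on Machine Learning Techniques for Multiple-Cost Imbalance
Project number: No.61272222
Project type: General Program
Year of approval: 2013
Discipline: Automation technology; computer technology
Project author: Yang Ming (杨明)
Affiliation: Nanjing Normal University (南京师范大学)
Funding: 810,000 yuan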
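Abstract (translated from Chinese): Cost-risk minimization is currently one of the effective decision criteria for classification and has attracted considerable attention from machine learning and pattern recognition researchers in China and abroad. Dimensionality reduction and cost-sensitive learning are effective strategies for improving the performance of classifiers under imbalanced costs, but research on classifiers that simultaneously minimize multiple costs (misclassification cost, attribute cost, etc.) is still rare, even though multiple-cost imbalance is common in classification problems such as those with imbalanced data. This project therefore pursues research in three areas: new strategies for feature selection, the design of classification models that minimize multiple costs, and the design of ensemble classifiers that minimize multiple costs. The focus is on: 1) proposing new feature selection methods based on hypothesis margin, information entropy, and the Filter-Wrapper model, and constructing new feature selection algorithms under new multiple-cost minimization criteria; 2) proposing new supervised (semi-supervised) feature selection algorithms that preserve local structure, and exploring new strategies for incorporating multiple-cost minimization; 3) designing supervised (semi-supervised) classification models that embed cost-sensitive learning strategies; 4) designing classifier models that incorporate both cost-sensitive learning and feature selection. Building on this work, the project further studies 1) the design of a class of classifiers that minimize multiple costs; 2) ensembles of classifiers that minimize multiple costs.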
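Keywords (translated from Chinese): cost-sensitive learning; feature selection; data imbalance; multi-label classification; dictionary learning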
English abstract: Cost-risk minimization is one of the effective decision criteria for classification and has attracted wide attention from machine learning and pattern recognition researchers worldwide. Dimensionality reduction and cost-sensitive learning are both effective strategies for improving the performance of classifiers under imbalanced costs, but very little work so far minimizes multiple costs (misclassification cost, attribute or test cost, waiting cost, etc.) simultaneously, even though multiple-cost imbalance is widespread in real classification problems such as classification on imbalanced datasets. This project therefore pursues three directions: new feature selection strategies, the design of classifiers that minimize multiple costs, and the design of ensemble classifiers that minimize multiple costs. Concretely, the project focuses on: 1) introducing new feature selection methods based on hypothesis margin, information entropy, or the Filter-Wrapper model, and then proposing novel feature selection approaches based on the newly developed multiple-cost minimization criteria; 2) developing new supervised (semi-supervised) feature selection algorithms that preserve local structure, and seeking new strategies that can effectively embed multiple costs minimizat
English keywords: cost-sensitive learning; feature selection; data imbalance; multi-label classification; dictionary learning