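Research on Machine Learning Techniques for Multiple-Cost Imbalance

Project name: Research on Machine Learning Techniques for Multiple-Cost Imbalance

Project number: No.61272222

Project type: General Program

Year approved: 2013

Discipline: Automation technology, computer technology

Project author: 杨明 (Yang Ming)

Author affiliation: 南京师范大学 (Nanjing Normal University)

Funding: RMB 810,000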

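Chinese abstract: Cost-risk minimization is currently one of the effective decision criteria for classification and has attracted considerable attention from machine learning and pattern recognition researchers in China and abroad. Dimensionality reduction and cost-sensitive learning are effective strategies for improving the performance of classifiers under cost imbalance, but classifiers that minimize multiple costs (misclassification cost, attribute cost, etc.) simultaneously have rarely been studied, even though multiple-cost imbalance is common in classification problems such as those involving imbalanced data. This project therefore pursues research in three directions: new strategies for feature selection, the design of classification models that minimize multiple costs, and the design of ensemble classifiers that minimize multiple costs. The focus is on: 1) proposing new feature selection methods based on hypothesis margin, information entropy, and the Filter-Wrapper model, and constructing new feature selection algorithms under new multiple-cost minimization criteria; 2) proposing new supervised (semi-supervised) feature selection algorithms that preserve local structure, and exploring new strategies for incorporating multiple-cost minimization; 3) designing supervised (semi-supervised) classification models that embed cost-sensitive learning strategies; 4) designing classifier models that incorporate both cost-sensitive learning and feature selection. Building on this work, the project further studies 1) the design of a class of classifiers that minimize multiple costs; 2) ensembles of multiple-cost-minimizing classifiers.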
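The cost-risk minimization criterion named in the abstract can be illustrated with a small sketch. This is the textbook minimum-expected-cost decision rule, not the project's own method; the function name `min_expected_cost_decision`, the cost matrix, and the posterior values are all hypothetical and chosen only for illustration.

```python
import numpy as np

def min_expected_cost_decision(posteriors, cost_matrix):
    """Pick, per sample, the class with the lowest expected misclassification cost.

    posteriors:  (n_samples, n_classes), row i holds P(true class = j | x_i).
    cost_matrix: (n_classes, n_classes), cost_matrix[j, k] is the cost of
                 predicting class k when the true class is j.
    """
    posteriors = np.asarray(posteriors, dtype=float)
    cost_matrix = np.asarray(cost_matrix, dtype=float)
    # expected_cost[i, k] = sum_j posteriors[i, j] * cost_matrix[j, k]
    expected_cost = posteriors @ cost_matrix
    return expected_cost.argmin(axis=1), expected_cost

# Hypothetical two-class setting: missing class 1 costs ten times a false alarm.
cost = np.array([[0.0, 1.0],
                 [10.0, 0.0]])
probs = np.array([[0.80, 0.20],
                  [0.95, 0.05]])
labels, risks = min_expected_cost_decision(probs, cost)
print(labels)
print(risks)
```

In this made-up example the first sample is assigned to class 1 even though its posterior for class 1 is only 0.2, because the expected cost of predicting class 0 (2.0) exceeds that of predicting class 1 (0.8). This is the behaviour that distinguishes a cost-sensitive rule from a plain maximum-posterior rule.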

Chinese keywords: cost-sensitive learning; feature selection; data imbalance; multi-label classification; dictionary learning

English abstract: Cost risk minimization is currently one of the effective decision criteria for classification, and it has drawn the attention of many researchers in machine learning and pattern recognition worldwide. Both dimensionality reduction and cost-sensitive learning are effective strategies for improving the performance of classifiers under imbalanced costs, but very little work has so far been done on simultaneously minimizing multiple costs (misclassification cost, attribute or test cost, wait cost, etc.) for classification, even though multiple-cost imbalance is widespread in real classification problems such as those involving imbalanced datasets. Therefore, this project aims to develop new feature selection strategies, classifier designs based on multiple-cost minimization, and ensemble classifier designs based on multiple-cost minimization. Concretely, the project focuses on: 1) introducing new feature selection methods based on hypothesis margin, information entropy, or the Filter-Wrapper model, and then proposing novel feature selection approaches based on the newly developed multiple-cost minimization criteria; 2) giving new supervised (semi-supervised) feature selection algorithms that preserve local structure, and seeking new strategies that can effectively embed multiple costs minimizat
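To make the idea of entropy-based feature selection under an attribute (test) cost more concrete, here is a minimal sketch. It is an assumed, simple filter heuristic (rank discrete features by information gain per unit cost, then pick greedily within a budget), not one of the algorithms the project proposes. It handles only attribute cost, not misclassification cost, and the names `select_features`, `test_costs`, and `budget` as well as the toy data are hypothetical.

```python
import numpy as np

def entropy(values):
    """Shannon entropy (in bits) of a discrete array."""
    _, counts = np.unique(values, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def information_gain(feature, labels):
    """H(labels) - H(labels | feature) for a discrete feature."""
    h_cond = 0.0
    for v in np.unique(feature):
        mask = feature == v
        h_cond += mask.mean() * entropy(labels[mask])
    return entropy(labels) - h_cond

def select_features(X, y, test_costs, budget):
    """Greedily pick informative features by gain/cost ratio within a cost budget."""
    X, y = np.asarray(X), np.asarray(y)
    test_costs = np.asarray(test_costs, dtype=float)
    gains = np.array([information_gain(X[:, j], y) for j in range(X.shape[1])])
    ratio = gains / test_costs
    chosen, spent = [], 0.0
    for j in np.argsort(-ratio):  # best gain-per-cost first
        if gains[j] > 1e-12 and spent + test_costs[j] <= budget:
            chosen.append(int(j))
            spent += float(test_costs[j])
    return chosen, spent

# Toy data: feature 0 predicts y perfectly but is expensive to measure.
X = np.array([[0, 0, 0],
              [0, 1, 0],
              [1, 0, 0],
              [1, 1, 1]])
y = np.array([0, 0, 1, 1])
costs = [5.0, 1.0, 1.0]
chosen, spent = select_features(X, y, costs, budget=3)
print(chosen, spent)
```

With a budget of 3 the sketch keeps only the cheap, partially informative feature 2; raising the budget to 6 also admits the expensive but perfectly predictive feature 0. Feature 1 carries no information about `y` and is never selected.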

English keywords: cost-sensitive learning; feature selection; data imbalance; multi-label classification; dictionary learning


