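Project title: Research on a GMDH-Based Semi-Supervised Ensemble Model for Customer Classification in a Big Data Environment
Project number: No.71471124
Project type: General Program
Approval year: 2015
Discipline: Management Science
Project author: Xiao Jin (肖进)
Affiliation: Sichuan University
Funding: 600,000 CNY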
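Chinese abstract (translated): Customer relationship management (CRM) and the customer classification problem within it are among the core topics of modern enterprise management theory, yet customer classification in a big data environment poses challenges for CRM. On the one hand, labeled samples available for modeling are usually scarce while large numbers of unlabeled samples are available, and customer data are often high-dimensional; on the other hand, customer data contain a great deal of noise. In contrast to the traditional research paradigm, which builds models from labeled data only, this project proposes a new semi-supervised classification paradigm that builds models from labeled and unlabeled data together, and presents the concept and research framework of a GMDH-based semi-supervised ensemble for customer classification in a big data environment. Within this framework, building on GMDH's strong resistance to noise and its automatic modeling mechanism, the project studies the semi-supervised learning mechanism for customer classification, proposes two GMDH-based semi-supervised feature selection models and two GMDH-based single semi-supervised classification models, and constructs three GMDH-based cost-sensitive semi-supervised ensemble selection strategies. Finally, for different customer classification problems, it gives the most suitable semi-supervised classification ensemble solution and carries out empirical studies. The research results will provide an effective tool for CRM in the big data era.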
Chinese keywords (translated): customer classification;big data;semi-supervised classification;GMDH;ensemble learning
English abstract: Customer relationship management (CRM), and customer classification in particular, is a core topic in modern enterprise management theory. However, customer classification in a big data environment poses challenges for CRM. On the one hand, relatively few class-labeled samples are available for model training while large numbers of unlabeled samples are available, and customer data tend to be high-dimensional; on the other hand, customer data contain a great deal of noise. In contrast to the traditional research paradigm, which builds models from labeled data only, this project proposes a new paradigm, semi-supervised classification, which uses labeled and unlabeled data simultaneously, and presents the concept and research framework of a semi-supervised ensemble based on the group method of data handling (GMDH) for customer classification with big data. Within this framework, building on the strong noise resistance and automatic modeling mechanism of GMDH, the project studies the semi-supervised learning mechanism for customer classification, proposes two GMDH-based semi-supervised feature selection models and two GMDH-based single semi-supervised classification models, and constructs three GMDH-based cost-sensitive semi-supervised ensemble selection strategies. Finally, it provides the most suitable semi-supervised classification ensemble solution for each type of customer classification problem and conducts empirical studies. The research results will provide an effective tool for CRM in the big data era.
English keywords: customer classification;big data;semi-supervised classification;GMDH;ensemble learning