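Project title: Classification and Prediction Methods Based on Nonparametric Random Forests and Their Applications
Project number: No. 71201139
Project type: Young Scientists Fund project
Year of approval: 2013
Discipline: Management Science and Engineering
Principal investigator: Fang Kuangnan (方匡南)
Affiliation: Xiamen University (厦门大学)
Funding: RMB 190,000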
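Abstract (translated from Chinese): Random forest (RF) is a nonparametric classification and prediction method. It is an important research area in forecasting science, an important direction for the future development of forecasting methods, and currently one of the most active frontier research areas in statistics and data mining. On the theoretical side, this project focuses on how RF can more effectively handle a multi-class dependent variable and multiple dependent variables, on the robustness of the method's predictions, on RF with a penalty term, and on variable selection for classification prediction models based on the lasso and group lasso. On the applied side, it proposes a credit card credit-risk identification model based on random forests, using the lasso and group lasso to screen the indicator system and build a reliable risk prediction model. It proposes a prediction model for profit contribution in the insurance industry based on random forest regression, which introduces liability reserves to effectively predict and identify high-quality customers. It also proposes a Value-at-Risk (VaR) prediction model for financial market risk based on random quantile regression forests, which takes into account not only the information in the variable itself but also the influence of other related variables, and combines multiple prediction results to improve the accuracy of VaR prediction.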
Keywords (translated from Chinese): random forest; classification prediction; nonparametric; variable selection
English abstract: Random forest (RF) is a nonparametric classification and forecasting method. It is an important research area in forecasting science and a direction for the future development of forecasting methodology, and it is also an active frontier area in statistics and data mining. On the theoretical side, this proposal focuses on improving RF: how to handle multinomial dependent variables and multiple dependent variables more effectively, the robustness of the proposed prediction method, RF with a penalty term, and variable selection based on the lasso and group lasso. On the applied side, we propose a credit risk identification model based on random forests, using the lasso and group lasso to select variables, establish the indicator system, and build a reliable risk prediction model. This proposal also proposes a prediction model for profit contribution in the insurance industry based on random forest regression, which introduces liability reserves to effectively forecast and identify high-quality customers. In addition, this proposal proposes a VaR prediction model for financial market risk based on random quantile regression forests, which not only considers its own la
English keywords: random forest; classification forecasting; nonparametric; variable selection